Abstract

This is a case study concerning the rate at which probabilistic coupling occurs 
for nilpotent diffusions. We focus on the simplest case of Kolmogorov diffusion 
(Brownian motion together with its time integral or, more generally, together 
with a finite number of iterated time integrals). We show that in this case 
there can be no Markovian maximal coupling. Indeed, there can be no efficient 
Markovian coupling strategy (efficient for all pairs of distinct starting values), 
where the notion of efficiency extends the terminology of Burdzy & Kendall 
(“Efficient Markovian couplings: examples and counterexamples”. Annals of 
Applied Probability, 2000, 10.2, 362-409). Finally, at least in the classical case 
of a single time integral, it is not possible to choose a Markovian coupling that 
is optimal in the sense of simultaneously minimizing the probability of failing to 
couple by time t for all positive t. In recompense for all these negative results, 
we exhibit a simple efficient non-Markovian coupling strategy. 

Keywords: 

Brownian motion; Brownian time integral; co-adapted coupling; coupling; efficient
coupling; filtration; finite-look-ahead coupling; hypoelliptic diffusion; immersed coupling;
Karhunen-Loève expansion; Kolmogorov diffusion; Markovian coupling; maximal coupling;
nilpotent diffusion; optimal Markovian coupling; reflection coupling;
synchronous coupling

2010 Mathematics Subject Classification: Primary 60G05 

Secondary 60J60 

1. Introduction 

This paper is written in homage and thanks to our friend and colleague Nick 
Bingham, who has always made it his mission to encourage and to spur on younger 
colleagues. It is a case-study of probabilistic coupling for a particular simple non-elliptic
diffusion, namely the Kolmogorov diffusion $(B, \int B\,dt)$ (Brownian motion together with
its time integral), studied for example in McKean’s celebrated stochastic oscillator
paper [20]. The work forms part of a long-running programme of study of coupling for
nilpotent diffusions, by which we mean diffusions with infinitesimal generators of the
form

$$\mathcal{X}_0 + \frac{1}{2} \sum_{i=1}^{k} \mathcal{X}_i^2$$

for smooth vectorfields $\mathcal{X}_0, \mathcal{X}_1, \mathcal{X}_2, \ldots, \mathcal{X}_k$ such that the Lie algebra generated by the
vectorfields is nilpotent (iterated Lie brackets of the vectorfields vanish at a sufficiently
high order of iteration), so that consequently the diffusion has smooth positive
probability transition densities. (Note that there is a fully worked out structure theory
for diffusions for which the vectorfields form a nilpotent Lie algebra: see for example
[18, Section 4.9].) As a case-study this work can be usefully compared with the study
[16] of probabilistic coupling for scalar Brownian motion together with local time at
the origin. In this introductory section, we begin by making a careful definition of an
iterated generalization of $(B, \int B\,dt)$, and then introduce some key coupling concepts.

1.1. The Kolmogorov diffusion 

The classical two-component Kolmogorov diffusion is obtained by pairing a real 
Brownian motion $B$ with its time integral $\int B\,dt$. The vectorfields in question are
$\mathcal{X}_0 = x_1\,\partial/\partial x_2$ and $\mathcal{X}_1 = \partial/\partial x_1$, and nilpotence follows from the fact that
$[\mathcal{X}_1, \mathcal{X}_0] = \mathcal{X}_2 = \partial/\partial x_2$, together with $[\mathcal{X}_1, \mathcal{X}_2] = [\mathcal{X}_0, \mathcal{X}_2] = 0$. The diffusion $(B, \int B\,dt)$ was
studied, for example, by McKean [20] as a simple example of a stochastic oscillator;
see also [1]. Distributional properties have been investigated, for example, in [11, 
26, 13], while estimates are studied in [27] when Brownian motion is replaced 
by a continuous local martingale. It has arisen in statistical studies: see [2] for an 
application to polynomial regression. There are potential applications (for example, 
to model relativistic diffusion of photons [3], also to model the motion of a tracer 
in fluid flow [12]); however its main interest is as a simple model for a non-elliptic 
diffusion. Coupling properties have been studied in [5] and numerically in [12], also 
(when supplemented by further iterated time integrals) in [17]; this problem provides 
the simplest non-trivial example of a diffusion with nilpotent group symmetries which 
admits a Markovian or immersed coupling. Most interest to date has focussed on 
the classical two-component Kolmogorov diffusion. Here we follow [17] in considering
coupling for (in the most part) the case of index $k$ ($k$ iterated time integrals), since the
general structure adds clarity to the arguments.

We begin with an explicit definition. Given B, a standard real-valued Brownian 
motion (hence begun at 0), we define the standard generalized Kolmogorov diffusion 
$I = (I_0, I_1, \ldots, I_k)^\top$ of index $k$ by the finite recursion

$$I_0(t) = B(t), \qquad I_r(t) = \int_0^t I_{r-1}(s)\,ds \quad \text{for } r = 1, 2, \ldots, k. \tag{1}$$

Thus $I_r$ is simply the $r$-fold iterated time-integral of Brownian motion. We shall refer
to the index $1$ case as the classical Kolmogorov diffusion, written in column vector
form as $(I_0, I_1)^\top = (B, \int B\,dt)^\top$.

Nilpotence for $I$ follows by considering the infinitesimal generator

$$\mathcal{X}_0 + \frac{1}{2} \mathcal{X}_1^2$$

where $\mathcal{X}_1 = \partial/\partial x_1$ (differentiation in the “Brownian” direction, temporarily deviating
from the indexing convention of (1) for the sake of convenience of exposition) but
$\mathcal{X}_0 = x_1 \mathcal{X}_2 + x_2 \mathcal{X}_3 + \cdots + x_{k-1} \mathcal{X}_k$, where $\mathcal{X}_r = \partial/\partial x_r$. An inductive argument based
on $[\mathcal{X}_r, \mathcal{X}_0] = \mathcal{X}_{r+1}$ (for $r = 1, \ldots, k-1$) shows that $k$th-iterated Lie brackets vanish.
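The inductive bracket computation can be checked mechanically. The sketch below is an illustration only, not part of the original argument: it represents the drift vectorfield as the linear field $x \mapsto Nx$ for a shift matrix $N$, represents each $\partial/\partial x_r$ as a constant field, and uses the standard fact that the Lie bracket of a constant field $e$ with a linear field $x \mapsto Nx$ is the constant field $Ne$. The function names and the dimension parameter `n` are assumptions made for this example.

```python
def shift_matrix(n):
    """N with N[a][b] = 1 when a == b + 1, so that x -> N x has components (0, x_1, ..., x_{n-1})."""
    return [[1 if a == b + 1 else 0 for b in range(n)] for a in range(n)]


def bracket_const_with_linear(e, N):
    """Lie bracket [e, N x] of a constant field e with the linear field x -> N x; it equals N e."""
    n = len(e)
    return [sum(N[a][b] * e[b] for b in range(n)) for a in range(n)]


def iterated_brackets(n):
    """Start from d/dx_1 and bracket repeatedly with the drift field until the result vanishes."""
    N = shift_matrix(n)
    field = [1] + [0] * (n - 1)
    chain = [field]
    while any(field):
        field = bracket_const_with_linear(field, N)
        chain.append(field)
    return chain
```

For `n = 3` the chain is the three coordinate fields followed by the zero field, which is the vanishing of iterated brackets described above; constant fields commute with one another, so no other brackets arise.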
Bearing in mind the form of the law of $I(t+s)$ conditional on $\mathcal{F}_t = \sigma\{B(u) : u \le t\}$,
we define the index $k$ Kolmogorov diffusion $\tilde{I}$, begun at $x = (x_0, x_1, \ldots, x_k)^\top$, using

$$\tilde{I}_r(t) = I_r(t) + x_r + t\,x_{r-1} + \frac{t^2}{2}\,x_{r-2} + \cdots + \frac{t^r}{r!}\,x_0. \tag{2}$$

This definition arises from considering the integral curve of $\mathcal{X}_0$ determined by the initial
values $x = (x_0, x_1, \ldots, x_k)^\top$ of $\tilde{I}$ at time 0, and using linearity.
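As a minimal sketch of definitions (1) and (2), the following function builds a discretised path of the index $k$ diffusion from supplied Brownian increments. It is an approximation under stated assumptions: the iterated integrals are replaced by left Riemann sums on a uniform grid, while the polynomial drift of (2) is added exactly. The function name and argument names are hypothetical.

```python
import math


def kolmogorov_path(x, increments, dt):
    """Discretised index-k Kolmogorov diffusion begun at x = (x_0, ..., x_k).

    increments: Brownian increments over consecutive steps of length dt.
    Returns (times, path) with path[r][i] approximating the r-th component at times[i].
    """
    k = len(x) - 1
    n = len(increments)
    # Standard diffusion begun at 0: component 0 is B, component r is a left Riemann sum of component r-1.
    std = [[0.0] * (n + 1) for _ in range(k + 1)]
    for i, db in enumerate(increments):
        std[0][i + 1] = std[0][i] + db
    for r in range(1, k + 1):
        for i in range(n):
            std[r][i + 1] = std[r][i] + std[r - 1][i] * dt
    # Add the deterministic polynomial drift of equation (2).
    times = [i * dt for i in range(n + 1)]
    path = [
        [
            std[r][i] + sum(x[r - j] * t ** j / math.factorial(j) for j in range(r + 1))
            for i, t in enumerate(times)
        ]
        for r in range(k + 1)
    ]
    return times, path
```

With all increments set to zero the output reduces to the integral curve of the drift vectorfield started at `x`, which gives a quick sanity check of the drift term.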

The Kolmogorov diffusion is simple enough to allow for explicit calculations, and yet 
exhibits interesting properties from the point of view of probabilistic coupling. Much 
of the interest of the index $k$ Kolmogorov diffusion $I$ lies in the observation that the
random vector $I(t)$ has a positive continuous density over all of $\mathbb{R}^{k+1}$ for any positive
time $t > 0$. This is related of course to Hörmander’s hypoellipticity theorem (since $I$ is
indeed a hypoelliptic diffusion), but can be seen much more directly by calculating the
covariance structure of $I(t)$ and noting that the resulting variance-covariance matrix
is non-singular. This computation is carried out in Section 2; but first we introduce
some relevant concepts from coupling. 

1.2. Maximal couplings 

The technique of probabilistic coupling dates back to Doeblin [8]. However Griffeath 
[10] was the first to prove a remarkable optimization result: it is possible to construct 
a coupling in a maximal way, in the following sense. Given two coupled copies $X$, $\tilde{X}$ of
a random process, let $T$ be the (random) coupling time, namely
$T = \inf\{t > 0 : X_s = \tilde{X}_s \text{ for all } s > t\}$. Then a given coupling is maximal if it simultaneously minimizes
$P[T > t]$ for all $t > 0$; in particular $P[T > t]$ is then given by the total variation
distance between the distributions of $X_t$ and $\tilde{X}_t$. Thus a maximal coupling occurs at
altogether the fastest possible rate. Griffeath proved the existence of maximal couplings
for discrete-time Markov chains. Pitman [21] gave an elegant explicit construction for
homogeneous discrete-time Markov chains on a countable state-space (and in fact
his construction generalizes easily), while Goldstein [9] extended the result to general
discrete-time random processes. For further generalizations see, for example, [24]. 

The notion of coupling contains many subtleties. For example, in general (and in 
contrast to the Markovian case described below) the simpler random time
$\inf\{t > 0 : X_t = \tilde{X}_t\}$ may fail to produce a coupling time; this relates to the notion of
faithful coupling [22] (note however that constructions of Pitman type deliver maximal 
couplings for which this simpler random time does produce a coupling). We note 
in passing that one can relax the definition of coupling to allow for arbitrary time- 
shifts: this is the notion of shift-coupling [25]. In general it is hard to produce 
explicit constructions of maximal couplings. On the other hand it is much easier to 
build Markovian couplings: couplings which jointly produce a Markov process, whose 
transition probability kernel has as marginals the transition probability kernels of the 
coupled diffusions. There is an important detail here: the marginals must be transition 
probability kernels with respect to the natural filtration of the coupled process. A slightly 
more general notion of coupling, that of immersed coupling (also called co-adapted 
coupling) can be described succinctly and with more clarity: the martingales for the 
filtrations of $X$ and $\tilde{X}$ remain martingales for the joint filtration of $(X, \tilde{X})$ [16], so that
the filtrations of $X$ and $\tilde{X}$ are immersed in the joint filtration of $X$ and $\tilde{X}$. However
in the following we will restrict ourselves to consideration of Markovian couplings. 
Note that the above constructions raise the question of whether the coupling time
$T$ can be chosen to be almost-surely finite, in which case the coupling is said to be
successful. In the case of the Kolmogorov diffusion it can be shown that successful
Markovian couplings exist [5, 17]. The extent to which this is the case for a general
nilpotent diffusion is an interesting open question (but see [5, 14, 15, 16]).

Markovian couplings are relatively easy to construct and verify (consider for example 
the classic reflection coupling of Brownian motion); however in general they will not be 
maximal (the Brownian reflection coupling provides a rare exception). In the case of 
driftless Brownian motions on rather general spaces, Kuwada [19] showed that maximal 
Markovian couplings only occur in highly symmetrical cases; it is shown in [4] that if 
a smooth elliptic diffusion on a Riemannian manifold admits a Markovian maximal 
coupling then the manifold must be a space form, the diffusion must be Brownian 
motion plus drift, and the drift must arise from a continuous one-parameter group of 
symmetries of the space form (possibly augmented by dilations in the Euclidean case). 
Thus (at least in the elliptic case) Markovian maximal couplings are very rare. 

1.3. Efficient couplings 

We have asserted that we should not expect maximal couplings to be readily
constructable. But for practical purposes it will often suffice to obtain a coupling such
that the probability of failing to couple by time t is comparable (asymptotically in t) 
to the probability of failing to couple maximally by t. 

Efficient couplings were introduced in [6] for the case of Markov chains in which 
there is rapid convergence to a stationary distribution: a coupling is efficient if the 
(presumed exponential) rate of coupling from generic initial states equals the (presumed 
exponential) rate of convergence to stationarity. Here we generalize to cases where 
a stationary distribution need not exist: instead of considering exponential rates of 
convergence, we consider the rate of coupling compared to the maximum possible rate. 


Definition 1. Let $\mu_{x,y}$ be a successful coupling of two Markov processes $X$ and $Y$ with
state space $S$. Suppose that $X$ and $Y$ start from distinct points $x, y \in S$ respectively,
with coupling time $\tau$, and let $d_t(x, y)$ denote the corresponding total variation distance
between their distributions. We call $\mu_{x,y}$ an efficient coupling if there exists a positive
constant $C(x, y)$ such that

$$\frac{\mu_{x,y}\{\tau > t\}}{d_t(x, y)} \le C(x, y) \tag{3}$$

for all $t > 0$ (note that $d_t(x, y) \le \mu_{x,y}\{\tau > t\}$ follows from Aldous’ inequality).

We call the family of couplings $\{\mu_{x,y} : x, y \in S\}$ an efficient coupling strategy if $\mu_{x,y}$
is an efficient coupling for every pair of distinct starting points $x, y \in S$.


Note that the constant C{x,y) and the maximal coupling rate may, and often do, 
depend on initial conditions. Note also that it is entirely possible for a diffusion to 
exhibit efficient couplings from some distinct pairs of starting points, while still failing 
to possess an efficient coupling strategy from all possible distinct pairs of starting 
points. (Examples can be constructed using diffusions which are ordered pairs of 
independent component diffusions, of which the first coordinate is efficient, and the 
second inefficient.) 

Even when maximal or efficient Markovian couplings do not exist, it is still possible 
that there may be a Markovian coupling which is optimal, in the sense of simultaneously 
minimizing $P[T > t]$ for all $t > 0$ over the class of all Markovian couplings. Such a
coupling exists in the case of simultaneous coupling of Brownian motion and its local 
time at zero [16]. 

1.4. Questions 

The purpose of this paper is to address the following questions: 

Q1: Is there a maximal Markovian coupling for the generalized Kolmogorov diffusion
$I$? (The work of [4] is primarily concerned with smooth elliptic diffusions and its
main results do not directly apply.)

Q2: If not, is there an efficient Markovian coupling for the generalized Kolmogorov
diffusion $I$?

Q3: Maximal couplings are typically hard to compute. Is there an efficient non-Markovian
coupling for the generalized Kolmogorov diffusion $I$ which is relatively
easy to describe?

We shall also address the question of whether there might be an optimal Markovian
coupling, but only in the case of the classic Kolmogorov diffusion $(B, \int B\,dt)$. We
restrict here to the classic case; indeed in the case of generalized Kolmogorov diffusions
$I$ of index exceeding 2 we know only of implicit and indirect constructions of Markovian
couplings [17], while construction of a successful Markovian coupling at index 2 is direct
but requires a somewhat involved analysis.

Section 1 describes basic coupling concepts and gives an exact definition of the
generalized Kolmogorov diffusion. Section 2 carries out some basic calculations for the
generalized Kolmogorov diffusion, from which is derived the straightforward Theorem
1, which establishes the rate of maximal coupling. Section 3 establishes a sequence
of negative results: for the generalized Kolmogorov diffusion it is not possible to
construct a Markovian maximal coupling (Theorem 2), nor are there any efficient
Markovian coupling strategies (Theorem 3). Moreover, for the classical Kolmogorov
diffusion, Theorem 5 (by way of the analytical Theorem 4 giving coupling rates for the
coupling described in [5]) shows there can be no optimal Markovian strategy (we restrict
here to the classical case of $(B, \int B\,dt)$ to avoid considerable potential complexity).
Section 4 gives a more positive result in the form of Theorem 6; working in the class 
of non-Markovian couplings which look ahead only over bounded intervals of time, 
this theorem exhibits a simple but efficient non-Markovian coupling strategy. Finally, 
Section 5 discusses possible future research directions. 

2. Explicit calculations for the Kolmogorov diffusion 

The Kolmogorov diffusion is a linear Gaussian diffusion, and therefore permits 
explicit calculation. From (2) the Kolmogorov diffusion $\tilde{I}(t)$ of index $k$, using $x$ as
initial configuration, can usefully be expressed in terms of the standard Kolmogorov
diffusion in vector form:

$$\tilde{I}(t) = I(t) + R(t)\,x. \tag{4}$$

Here the $(k+1) \times (k+1)$ lower-triangular matrix $R(t)$ can be written as
$R(t) = D(t)\,R\,D(t^{-1})$, where $D(t)$ is the $(k+1) \times (k+1)$ diagonal matrix with entries $1, t, \ldots, t^k$
running down the diagonal, and

$$R_{a,b} = \begin{cases} \dfrac{1}{(a-b)!} & \text{if } a \ge b, \\ 0 & \text{otherwise.} \end{cases} \tag{5}$$

Note that $R(t+s) = R(t)\,R(s)$. Note also that $\det R = 1$, so that $R$ and $R(t)$ (for
$t > 0$) are non-singular matrices.
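The two properties just noted are easy to confirm in exact rational arithmetic. The following is a small illustrative sketch, with function names chosen for this example rather than taken from the paper; it builds $R$ from (5), forms $D(t)\,R\,D(t^{-1})$ entrywise, and allows the semigroup identity to be checked for particular values.

```python
from fractions import Fraction
from math import factorial


def R_const(k):
    """The constant matrix of (5): entry 1/(a-b)! for a >= b, and 0 otherwise."""
    return [
        [Fraction(1, factorial(a - b)) if a >= b else Fraction(0) for b in range(k + 1)]
        for a in range(k + 1)
    ]


def R_of_t(k, t):
    """D(t) R D(1/t) with D(t) = diag(1, t, ..., t^k); t must be non-zero."""
    t = Fraction(t)
    R = R_const(k)
    return [[t ** a * R[a][b] * t ** (-b) for b in range(k + 1)] for a in range(k + 1)]


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)] for i in range(n)]
```

The identity for the product is the binomial theorem in disguise, so the check with `Fraction` entries is exact rather than approximate. The determinant claim follows because the matrix is lower-triangular with unit diagonal.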

Consider the Kolmogorov diffusion $I$ of index $k$ and begun at $0$. Mathematical
induction establishes the following linear relationship of Volterra integral type:

$$I_k(t) = \int_0^t \frac{(t-s)^{k-1}}{(k-1)!}\,B(s)\,ds \quad \text{for } k > 0.$$

Fixing $T$, and defining $F(T-t) = (T-t)^k / k!$ for $k > 0$, an application of Itô’s formula
to $F(T-t)\,B(t)$ for standard Brownian motion $B$ shows that

$$I_k(T) = \int_0^T \frac{(T-t)^k}{k!}\,dB(t).$$

In fact this holds for all $k \ge 0$. Applying isometry for Itô integrals, we find that for
all $a, b \ge 0$

$$E\bigl[ I_a(T)\,I_b(T) \bigr] = \int_0^T \frac{(T-t)^a}{a!}\,\frac{(T-t)^b}{b!}\,dt = \binom{a+b}{a} \frac{T^{a+b+1}}{(a+b+1)!}.$$

Thus the variance-covariance matrix for $I(T)$, equivalently $\tilde{I}(T)$, is given by
$T\,D(T)\,V\,D(T)$, where

$$V_{a,b} = \binom{a+b}{a} \frac{1}{(a+b+1)!} \quad \text{for } a, b \ge 0. \tag{6}$$


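Note that $V$ is non-singular: given a vector $\alpha = (\alpha_0, \ldots, \alpha_k)^\top$ of coefficients, the $L^2$-isometry implies

$$\alpha^\top V \alpha = \int_0^1 \Bigl( \sum_{a=0}^{k} \alpha_a \frac{(1-t)^a}{a!} \Bigr)^2 dt,$$

and this integral is zero only if the polynomial $\sum_{a=0}^{k} \alpha_a (1-t)^a / a!$ vanishes identically, forcing
$\alpha = 0$. Thus $V$ is symmetric positive-definite.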

Consider the Cholesky decomposition of the symmetric matrix $V$ (unique up to $\pm$
sign, since $V$ is positive-definite). This provides a lower-triangular non-singular matrix
$L$ such that

$$V = L L^\top. \tag{7}$$

Applying (6), (7), and the lower-triangular nature of $L$, note for future use that the
top row of $L$ can be taken to be $(1, 0, \ldots, 0)$. There follows a representation of the
distribution of $\tilde{I}(T)$ in terms of a vector $W$ filled with $k+1$ independent standard
Brownian motions: for fixed $T$ we obtain the equality in distribution

$$\tilde{I}(T) = R(T)\,x + D(T)\,L\,W(T). \tag{8}$$

(Note however that this equality in distribution cannot be translated into a sample-path
equality as $T$ varies.)
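The matrices in (6) and (7) can be computed directly. The sketch below is illustrative only, with function names chosen for this example: it forms $V$ exactly, assembles the covariance $T\,D(T)\,V\,D(T)$, and computes a Cholesky factor in floating point with the textbook row-by-row recursion. A failure of positive-definiteness would surface as a `ValueError` from the square root.

```python
from fractions import Fraction
from math import comb, factorial, sqrt


def V_matrix(k):
    """Exact entries of (6): binom(a+b, a) / (a+b+1)!."""
    return [
        [Fraction(comb(a + b, a), factorial(a + b + 1)) for b in range(k + 1)]
        for a in range(k + 1)
    ]


def covariance(k, T):
    """Variance-covariance matrix T D(T) V D(T) of the index-k diffusion at time T."""
    T = Fraction(T)
    V = V_matrix(k)
    return [[T * T ** a * V[a][b] * T ** b for b in range(k + 1)] for a in range(k + 1)]


def cholesky(V):
    """Lower-triangular L with positive diagonal such that L L^T = V (floating point)."""
    n = len(V)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                L[i][j] = sqrt(float(V[i][i]) - s)
            else:
                L[i][j] = (float(V[i][j]) - s) / L[j][j]
    return L
```

In the classical case `covariance(1, T)` reproduces the familiar variances $T$ and $T^3/3$ and covariance $T^2/2$ of Brownian motion and its time integral, and the top row of the Cholesky factor is $(1, 0, \ldots, 0)$ as noted above.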

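Suppose $\tilde{I}^{(1)}$, $\tilde{I}^{(2)}$ are begun at $x^{(1)}$, $x^{(2)}$ respectively. Then (8) can be used to
compute the total variation distance between the Gaussian distributions $\mathcal{L}(\tilde{I}^{(1)}(T))$
and $\mathcal{L}(\tilde{I}^{(2)}(T))$. These Gaussian distributions have the same variance-covariance
matrix, and therefore the total variation distance is given by the following expression,
with $z = x^{(1)} - x^{(2)}$:

$$\operatorname{dist}_{TV}\bigl( \mathcal{L}(\tilde{I}^{(1)}(T)), \mathcal{L}(\tilde{I}^{(2)}(T)) \bigr)
 = P\Bigl[ |N(0,1)| < \frac{1}{2\sqrt{T}} \bigl\| L^{-1} R\,D(T^{-1})\,z \bigr\| \Bigr]
 \le \frac{1}{\sqrt{2\pi T}} \bigl\| L^{-1} R\,D(T^{-1})\,z \bigr\|. \tag{9}$$

For large $T$, the bound and the total variation distance are asymptotically equivalent.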
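As a cross-check on (9), the same total variation distance can be computed without the Cholesky factor, directly from the covariance $T\,D(T)\,V\,D(T)$ and the mean difference $R(T)\,z$. The sketch below is an illustration under that reformulation, not the paper's own computation: the quantity `q` is the squared Mahalanobis distance between the two Gaussian means, which equals $T^{-1} \| L^{-1} R\,D(T^{-1})\,z \|^2$. All function names are hypothetical.

```python
import math
from fractions import Fraction
from math import comb, factorial


def mahalanobis_sq(z, T):
    """Exact squared Mahalanobis distance between the two Gaussian laws at time T."""
    n = len(z)
    T = Fraction(T)
    z = [Fraction(c) for c in z]
    cov = [
        [T ** (a + b + 1) * comb(a + b, a) / factorial(a + b + 1) for b in range(n)]
        for a in range(n)
    ]
    delta = [
        sum(T ** (a - b) / factorial(a - b) * z[b] for b in range(a + 1)) for a in range(n)
    ]
    # Gaussian elimination on [cov | delta]; no pivoting is needed for a positive-definite matrix.
    M = [row[:] + [delta[i]] for i, row in enumerate(cov)]
    for c in range(n):
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    m = [Fraction(0)] * n
    for r in reversed(range(n)):
        m[r] = (M[r][n] - sum(M[r][c] * m[c] for c in range(r + 1, n))) / M[r][r]
    return sum(d * w for d, w in zip(delta, m))


def tv_and_bound(z, T):
    """Total variation distance of (9) and its upper bound, as floats."""
    q = float(mahalanobis_sq(z, T))
    tv = math.erf(math.sqrt(q) / (2.0 * math.sqrt(2.0)))  # P[|N(0,1)| < sqrt(q)/2]
    bound = math.sqrt(q / (2.0 * math.pi))
    return tv, bound
```

For example, `mahalanobis_sq([0, 1], T)` in the classical case equals $12/T^3$, reflecting the faster decay when the Brownian coordinates of the two starting points agree.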

The relationship between coupling, maximal coupling and total variation distance 
[10, 21, 9] (in particular Aldous’ inequality) immediately establishes sharp bounds on 
the coupling rate for Kolmogorov diffusions, with coupling rate depending on the extent 
of agreement between low-index initial conditions. 

Theorem 1. (Maximal coupling rate for the generalized Kolmogorov diffusion.) If
$\tilde{I}^{(1)}$, $\tilde{I}^{(2)}$ are coupled copies of the generalized Kolmogorov diffusion begun at $x^{(1)}$, $x^{(2)}$
respectively, with $z_0 = \ldots = z_{r-1} = 0$ and $z_r \neq 0$ for $z = x^{(1)} - x^{(2)}$, and $\tau$ denotes
the coupling time, then

$$P[\tau > T] \ge P\Bigl[ |N(0,1)| < \frac{1}{2\sqrt{T}} \bigl\| L^{-1} R\,D(T^{-1})\,z \bigr\| \Bigr] = O\bigl( T^{-(r + 1/2)} \bigr)$$

and this sharp lower bound is achieved by a maximal coupling between $\tilde{I}^{(1)}$ and $\tilde{I}^{(2)}$.

Proof. It is classical that the maximal coupling achieves the bound provided
by total variation distance [10, 21, 9]. Hence this result follows directly from (9) and
the lower-triangular nature of $R$ and $L$ and hence of $L^{-1} R$. From (9), $P[\tau > T]$ is
controlled by $P\bigl[ |N(0,1)| < \frac{1}{2\sqrt{T}} \| L^{-1} R\,D(T^{-1})\,z \| \bigr]$. But if the first $r$ indices of $z$
vanish then $\| L^{-1} R\,D(T^{-1})\,z \| = O(T^{-r})$, and the result follows from the naive bound
$P[|N(0,1)| < u] \le 2u / \sqrt{2\pi}$.

This has implications for efficient (albeit possibly non-Markovian) coupling strategies
for the generalized Kolmogorov diffusion: coupling will occur much faster if the
initial states give the same initial Brownian locations and agree up to the first $r - 1 \ge 0$
iterated integrals.
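To make the dependence on the initial conditions concrete, here is a worked instance of (9) in the classical case $k = 1$. It is a sketch derived from (5), (6) and (7) for illustration, and is not quoted from the source.

```latex
V = \begin{pmatrix} 1 & 1/2 \\ 1/2 & 1/3 \end{pmatrix}, \quad
L = \begin{pmatrix} 1 & 0 \\ 1/2 & 1/(2\sqrt{3}) \end{pmatrix}, \quad
R = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \quad
D(T^{-1}) = \begin{pmatrix} 1 & 0 \\ 0 & T^{-1} \end{pmatrix},
\\[1ex]
L^{-1} R\, D(T^{-1})\, z
  = \begin{pmatrix} z_0 \\ \sqrt{3}\,\bigl(z_0 + 2 z_1 / T\bigr) \end{pmatrix},
\qquad
\bigl\| L^{-1} R\, D(T^{-1})\, z \bigr\|^2 = z_0^2 + 3\,\bigl(z_0 + 2 z_1 / T\bigr)^2 .
\\[1ex]
z_0 \neq 0: \quad
\frac{1}{\sqrt{2\pi T}} \bigl\| L^{-1} R\, D(T^{-1})\, z \bigr\|
  \sim |z_0| \sqrt{\frac{2}{\pi T}} \quad (T \to \infty);
\qquad
z_0 = 0: \quad
\frac{1}{\sqrt{2\pi T}} \bigl\| L^{-1} R\, D(T^{-1})\, z \bigr\|
  = |z_1| \sqrt{\frac{6}{\pi}}\; T^{-3/2} .
```

So in this case the exponent in Theorem 1 moves from $1/2$ to $3/2$ exactly when the two Brownian starting values coincide.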
3. Markovian couplings of Kolmogorov diffusions

In this section we consider the rate of coupling for Markovian couplings of Kolmogorov
diffusions: can Markovian couplings be maximal in this case? or efficient?
or, failing either of these, can there be Markovian couplings of Kolmogorov
diffusions which are optimal in the sense of coupling faster than any other Markovian
couplings? We shall see that the answers to the first two of these questions are negative
for Kolmogorov diffusions of whatever positive index. For the third question we shall 
consider only classic (index one) Kolmogorov diffusions, and show that at least in this 
case the answer is again negative. 

3.1. Markovian Kolmogorov couplings are never maximal

For a very general class of Markov processes, if a Markovian maximal coupling 
exists for two copies of such a process started from distinct points, then this coupling 
must satisfy some specific properties. Maximal coupling has to occur at the space-time 
interface defined by equality of the transition probabilities of the two coupled diffusions. 
If in addition the coupling is Markovian, then Varadhan asymptotics show that at a 
given time the interface must be of a specific form (a hyperplane, if the diffusive part 
of the diffusion is Brownian). Both properties follow from ‘soft’ arguments and do 
not depend explicitly on the specific process being considered. This is the content of 
Section 1.1 of [4] and the results there can be used to show the following: 

Theorem 2. There is no Markovian maximal coupling for the Kolmogorov diffusion 
of any positive index, started from any pair of distinct points. 

Proof. Consider the implications of the existence of a Markovian maximal coupling
between two index $k$ generalized Kolmogorov diffusions $\tilde{I}^+$ and $\tilde{I}^-$, begun at different
starting points, $x_0^+$ and $x_0^-$ respectively. Suppose that the coupling time is $\tau$.

First consider the case where the last coordinates of $x_0^+$ and $x_0^-$ differ. By linearity,
we may suppose that $x_0^+ + x_0^- = 0$ (the zero vector). By (8), for any fixed $t > 0$,

$$\tilde{I}^\pm(t) = R(t)\,x_0^\pm + D(t)\,L\,W^\pm(t).$$

Here $W^\pm$ are $(k+1)$-dimensional Brownian motions which are coupled (not necessarily
in a Markovian manner). The Gaussian distributions of $\tilde{I}^\pm(t)$ have the same
non-singular variance-covariance matrix $t\,D(t)\,V\,D(t)$: the corresponding probability
densities agree on a hyperplane $\mathcal{H}(t; x_0^+, x_0^-)$ which runs through $0$ and is orthogonal
to the vector given by the vector expression

$$\bigl( t\,D(t)\,V\,D(t) \bigr)^{-1} R(t)\,z = t^{-1} D(t^{-1})\,V^{-1} R\,D(t^{-1})\,z,$$

where $z = x_0^+ - x_0^-$. Note that invertibility of $V$ and $R$ implies that this vector is
non-zero.

It is convenient to scale this vector expression by $t^{2k+1}$ and so to deduce that the
hyperplane $\mathcal{H}(t; x_0^+, x_0^-)$ is normal to the vector

$$\bigl( t^k D(t^{-1}) \bigr)\,V^{-1} R\,\bigl( t^k D(t^{-1}) \bigr)\,z, \tag{10}$$

where $t^k D(t^{-1})$ is a diagonal $(k+1) \times (k+1)$ matrix whose diagonal is composed of
$t^k, t^{k-1}, \ldots, t, 1$.
The hyperplane $\mathcal{H}(t; x_0^+, x_0^-)$ separates $\mathbb{R}^{k+1}$ into disjoint half-spaces, one containing
$\tilde{I}^+(0)$ and the other containing $\tilde{I}^-(0)$. We call the respective half-spaces $\mathcal{H}^+(t; x_0^+, x_0^-)$
and $\mathcal{H}^-(t; x_0^+, x_0^-)$. Now the following observations must necessarily hold for a Markovian
maximal coupling $\mu$ of $\tilde{I}^\pm$ with coupling time $\tau$:

1. Denote the probability densities corresponding to $\tilde{I}^\pm(t)$ by $p_t(x_0^\pm, \cdot)$. Let
$\alpha_t(\cdot) = p_t(x_0^+, \cdot) - p_t(x_0^-, \cdot)$. Then [4, Lemma 2] states that, for any Borel measurable
set $A \subseteq \mathbb{R}^{k+1}$,

$$\mu\{\tilde{I}^\pm(t) \in A,\ \tau > t\} = \int_A \alpha_t^\pm(z)\,dz,$$

where $\alpha_t^+(z) = \max\{p_t(x_0^+, z) - p_t(x_0^-, z), 0\}$ and $\alpha_t^-(z) = \max\{p_t(x_0^-, z) - p_t(x_0^+, z), 0\}$.
Thus, if coupling has not been successful by time $t$, then at time $t$ the diffusions
$\tilde{I}^\pm(t)$ must lie on different sides of the hyperplane $\mathcal{H}(t; x_0^+, x_0^-)$. Moreover we may
use the Gaussian nature of $\tilde{I}^\pm(t)$ to deduce that under the conditioning $\tau > t$
the support of $\tilde{I}^+(t)$ (respectively $\tilde{I}^-(t)$) must be the whole of $\mathcal{H}^+(t; x_0^+, x_0^-)$
(respectively $\mathcal{H}^-(t; x_0^+, x_0^-)$).

2. Let $\mu_t$ denote the joint law of the coupled $\tilde{I}^\pm$ evaluated at time $t$. The coupling
is Markovian and maximal, so [4, Lemma 3] shows that for almost every pair
$(x^+, x^-)$ of distinct points in the support of $\mu_t$ it must be the case that the
forward processes $\tilde{I}^\pm(t + \cdot)$ generate a new Markovian maximal coupling
starting from $x^+$ and $x^-$ respectively. We will denote the set of such $(x^+, x^-)$
by $\mathcal{M}(\mu_t)$.

The first observation shows that, for any positive $t$, $s$, when conditioned on $\tau >
t + s$, the support of the conditional distribution of $\tilde{I}^\pm(t + s)$ must be the whole
of $\mathcal{H}^\pm(t + s; x_0^+, x_0^-)$. Adding the second observation, we may deduce that for all
$(x^+, x^-) \in \mathcal{M}(\mu_t)$, when conditioning on $\tilde{I}^+(t) = x^+$ and $\tilde{I}^-(t) = x^-$, then the support
of the conditional distribution of $\tilde{I}^\pm(t + s)$ (also given $\tau > t + s$) must be the whole
of $\mathcal{H}^\pm(s; x^+, x^-)$, where $\mathcal{H}(s; x^+, x^-)$ is the hyperplane on which the densities at time
$s$ agree for two generalized Kolmogorov diffusions begun at $x^+$ and $x^-$ respectively.
Thus for $(x^+, x^-) \in \mathcal{M}(\mu_t)$ we must have $\mathcal{H}^\pm(t + s; x_0^+, x_0^-) = \mathcal{H}^\pm(s; x^+, x^-)$.

Arguing as above, the common hyperplane must be normal to the vector

$$\bigl( s^k D(s^{-1}) \bigr)\,V^{-1} R\,\bigl( s^k D(s^{-1}) \bigr)(x^+ - x^-)
 = \bigl( s^k D(s^{-1}) \bigr)\,V^{-1}\,\bigl( s^k D(s^{-1}) \bigr) \times \bigl( s^{-k} D(s) \bigr)\,R\,\bigl( s^k D(s^{-1}) \bigr)(x^+ - x^-).$$

Now consider the limit of this vector as $s \downarrow 0$. We see that $\bigl( s^k D(s^{-1}) \bigr)\,V^{-1}\,\bigl( s^k D(s^{-1}) \bigr)$
converges to a matrix all of whose entries are zero save for the $(k, k)$ position. Moreover
this entry is the same as the $(k, k)$ entry of $V^{-1}$: since $V$ is symmetric positive-definite,
it follows that $V^{-1}$ is also symmetric positive-definite, and therefore its $(k, k)$
entry must be positive.

On the other hand, consider the lower-triangular matrix $\bigl( s^{-k} D(s) \bigr)\,R\,\bigl( s^k D(s^{-1}) \bigr)$.
All off-diagonal entries must tend to zero with $s$; the on-diagonal entries are not affected
and, by (5), all are equal to 1.

These facts, along with the assumption that the last coordinate of $z$ is non-zero,
imply that as $s \downarrow 0$ so $\bigl( s^k D(s^{-1}) \bigr)\,V^{-1} R\,\bigl( s^k D(s^{-1}) \bigr)(x^+ - x^-)$ converges to a vector
parallel to $(0, 0, \ldots, 0, 1)^\top$, and this vector is non-zero if $s > 0$. Consequently
$\mathcal{H}(t; x_0^+, x_0^-)$ is normal to $e_k = (0, 0, \ldots, 0, 1)^\top$ for all positive $t$.

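Together with (10), this tells us that, for each $t > 0$, there is a non-zero scalar $\lambda_t$
such that

$$\bigl( t^k D(t^{-1}) \bigr)\,V^{-1}\,\bigl( t^k D(t^{-1}) \bigr) \times \bigl( t^{-k} D(t) \bigr)\,R\,\bigl( t^k D(t^{-1}) \bigr)\,z = \lambda_t e_k$$

or equivalently

$$\bigl( t^{-k} D(t) \bigr)\,R\,\bigl( t^k D(t^{-1}) \bigr)\,z = \lambda_t\,\bigl( t^{-k} D(t) \bigr)\,V\,\bigl( t^{-k} D(t) \bigr)\,e_k. \tag{11}$$

For any $0 \le i \le k$, (11) yields

$$\sum_{j=0}^{i} R_{i,j}\,t^{i-j} z_j = \lambda_t\,t^{i-k}\,V_{i,k}. \tag{12}$$

Putting $i = k$ in (12), we find

$$\lambda_t = \frac{1}{V_{k,k}} \sum_{j=0}^{k} R_{k,j}\,t^{k-j} z_j.$$

Substituting this value of $\lambda_t$ back into (12), we obtain the following equation:

$$\sum_{j=0}^{i} R_{i,j}\,t^{i-j} z_j = \frac{V_{i,k}}{V_{k,k}} \sum_{j=0}^{k} R_{k,j}\,t^{i-j} z_j.$$

Setting $i = 0$ in the above equation and comparing coefficients of inverse powers of
$t$, the explicit formulae (5) for $R$ and (6) for $V$ lead to $z = 0$, and we thus obtain a
contradiction of the initial hypothesis that $z_k \neq 0$.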

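If the last coordinates of $x_0^+$ and $x_0^-$ agree, then consider the largest $j < k$ such that
$\tilde{I}_j^+(0) \neq \tilde{I}_j^-(0)$. Without loss of generality, suppose $\tilde{I}_j^+(0) > \tilde{I}_j^-(0)$. Path continuity of
the diffusion shows that there must be $T > 0$ such that the measure
$\mu\{\tilde{I}_j^+(t) - \tilde{I}_j^-(t) > 0 \text{ for all } t \le T\}$ must be positive. But, since the coordinates of index
$j+1, \ldots, k$ agree at time $0$,

$$\tilde{I}_k^+(t) - \tilde{I}_k^-(t) = \int_0^t \frac{(t-s)^{k-j-1}}{(k-j-1)!} \bigl( \tilde{I}_j^+(s) - \tilde{I}_j^-(s) \bigr)\,ds,$$

and therefore $\mu\{\tilde{I}_k^+(t) - \tilde{I}_k^-(t) > 0 \text{ for all } 0 < t \le T\} > 0$.

In particular, $\mu\{\tilde{I}_k^+(T) \neq \tilde{I}_k^-(T)\} > 0$. Under the hypothesis of existence of a
Markovian maximal coupling starting from $x_0^+$ and $x_0^-$, and using [4, Lemma 3], we can
find $(x^+, x^-) \in \mathcal{M}(\mu_T)$ with differing last coordinates such that the coupled forward
processes $(\tilde{I}^+(T + \cdot), \tilde{I}^-(T + \cdot))$
started from $(x^+, x^-)$ create a Markovian maximal coupling. The previous argument
applied to this coupling then leads to a contradiction.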
3.2. Markovian Kolmogorov couplings are never efficient

As before, let $d_t(x, y)$ denote the total variation distance between the laws of
generalized Kolmogorov diffusions (of the same positive index $k$) started from $x$ and
$y$ respectively. From Theorem 1, if the first coordinates of $x$ and $y$ disagree then
$d_t(x, y) \sim t^{-1/2}$, while if they agree then $d_t(x, y) \sim t^{-3/2}$ or even higher powers of
$t^{-1}$ (if further coordinates agree). That is to say, for efficient (possibly non-Markovian)
coupling strategies (see Definition 1), the coupling happens much faster when the scalar 
Brownian motions, driving the coupled Kolmogorov diffusions of index k, both start 
from the same point. This turns out to be the key observation in proving Theorem 3. 
The theorem, on the absence of efficient Markovian coupling strategies, will follow as 
a direct corollary of the following lemma. 

Lemma 3.1. Let $\mu$ be any successful Markovian coupling of Kolmogorov diffusions
starting from distinct points $x$ and $y$ whose first coordinates agree. Let $\tau$ denote the
coupling time. Then there are positive constants $C_\mu, t_\mu$ (possibly depending on the
coupling $\mu$) such that, for all $t \ge t_\mu$,

$$\mu\{\tau > t\} \ge \frac{C_\mu}{\sqrt{t}}. \tag{13}$$

Proof. Let $\tilde{I}^{(1)}$ and $\tilde{I}^{(2)}$ denote specific Markovian-coupled copies of the generalized
Kolmogorov diffusions starting from $x$ and $y$ (with first coordinates agreeing but second
coordinates disagreeing). This corresponds to a Markovian coupling of the driving
Brownian motions $B$ and $\tilde{B}$. Let $\mathcal{F}_t = \sigma\{\tilde{I}^{(1)}(s), \tilde{I}^{(2)}(s) : s \le t\}$ be the corresponding
$\sigma$-algebra of events determined by time $t$. Consider the possibility that
$\mu\{B(t) = \tilde{B}(t)\} = 1$ for all $t$. A Fubini argument then shows that

$$\mu\{B(t) = \tilde{B}(t) \text{ for almost every } t \ge 0\} = 1.$$

By path continuity of Brownian motion, the above would imply that $B$ and $\tilde{B}$ are
synchronously coupled $\mu$-almost surely and hence
$\mu\{\tilde{I}_1^{(1)}(t) - \tilde{I}_1^{(2)}(t) = x_1 - y_1 \text{ for all } t \ge 0\} = 1$.
So a successful Markovian coupling cannot be obtained in this manner. Thus
for a successful Markovian coupling $\mu$ there must exist $t_0 > 0$ such that
$\mu\{B(t_0) \neq \tilde{B}(t_0)\} > 0$. Now, for every $t > t_0$,

$$\mu\{\tau > t\} = E_\mu\bigl[ E_\mu[ I(\tau > t) \mid \mathcal{F}_{t_0} ] \bigr],$$

where $E_\mu$ represents expectation with respect to the probability measure $\mu$. Introduce
the ordinary Brownian coupling time $\tau^* = \inf\{s \ge t_0 : B(s) = \tilde{B}(s)\}$. Evidently this
happens no later than the Kolmogorov coupling time, so $I(\tau > t) \ge I(\tau^* > t)$. As the
coupling is Markovian, the shifted process $\{(\theta_{t_0} B(s), \theta_{t_0} \tilde{B}(s)) : s \ge 0\}$ (when conditioned
on $\mathcal{F}_{t_0}$) gives a coupling of Brownian motions starting from $(B(t_0), \tilde{B}(t_0))$. If
$d_t^B(x, y)$ represents the total variation distance between the distributions of Brownian
motions starting from $x$ and $y$ at time $t$, then we know that for all $t > 0$ and all $x, y$
satisfying $|x - y| \le 2\sqrt{t}$,

$$d_t^B(x, y) \ge \frac{1}{\sqrt{2\pi e}}\,\frac{|x - y|}{\sqrt{t}}.$$

There is $t_\mu > t_0$ such that, for all $t \ge t_\mu$,

$$E_\mu\bigl[ |B(t_0) - \tilde{B}(t_0)| \; ; \; |B(t_0) - \tilde{B}(t_0)| \le 2\sqrt{t - t_0} \bigr] \ge \sqrt{2\pi e}\,C_\mu,$$

where $C_\mu = \frac{1}{2\sqrt{2\pi e}} E_\mu |B(t_0) - \tilde{B}(t_0)|$. (The left-hand side converges to $2\sqrt{2\pi e}\,C_\mu$ as
$t \to \infty$.) Thus, for $t \ge t_\mu$,

$$E_\mu\bigl[ E_\mu[ I(\tau > t) \mid \mathcal{F}_{t_0} ] \bigr] \ge E_\mu\bigl[ E_\mu[ I(\tau^* > t) \mid \mathcal{F}_{t_0} ] \bigr] \ge E_\mu\bigl[ d_{t - t_0}^B(B(t_0), \tilde{B}(t_0)) \bigr]$$

$$\ge \frac{1}{\sqrt{2\pi e}}\,E_\mu\Bigl[ \frac{|B(t_0) - \tilde{B}(t_0)|}{\sqrt{t - t_0}} \; ; \; \frac{|B(t_0) - \tilde{B}(t_0)|}{\sqrt{t - t_0}} \le 2 \Bigr]
 \ge C_\mu (t - t_0)^{-1/2} \ge C_\mu t^{-1/2},$$

which proves the lemma.
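The elementary Brownian estimate used in this proof can be checked numerically. The sketch below is illustrative, with function names chosen for this example: it evaluates the total variation distance between two Brownian motions at time $t$, which for starting points $x$ and $y$ is $P[|N(0,1)| < |x - y| / (2\sqrt{t})]$, and compares it with the lower bound $|x - y| / \sqrt{2\pi e\,t}$ on the range $|x - y| \le 2\sqrt{t}$.

```python
import math


def brownian_tv(x, y, t):
    """Total variation distance at time t between Brownian motions begun at x and y."""
    return math.erf(abs(x - y) / (2.0 * math.sqrt(2.0 * t)))


def brownian_tv_lower_bound(x, y, t):
    """Lower bound |x - y| / sqrt(2 pi e t), intended for |x - y| <= 2 sqrt(t)."""
    return abs(x - y) / math.sqrt(2.0 * math.pi * math.e * t)
```

The bound comes from replacing the standard normal density on $[0, 1]$ by its minimum there, which is its value at $1$; this is why the restriction $|x - y| \le 2\sqrt{t}$ appears.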

Theorem 3. There does not exist any efficient Markovian coupling strategy (in the 
sense of Definition 1) for the Kolmogorov diffusion of index k (for k > 0). 

Proof. We argue by contradiction. If there is such an efficient Markovian coupling
strategy, then there must exist an efficient Markovian coupling $\mu$ of the generalized
Kolmogorov diffusions starting from $x$ and $y$ (with first coordinates agreeing but
second coordinates disagreeing) with coupling time $\tau$. Then, by Lemma 3.1, there
exist positive constants $C_\mu, t_\mu$ such that $\mu\{\tau > t\} \ge C_\mu / \sqrt{t}$ for all $t \ge t_\mu$.

But, as noted in the discussion preceding Lemma 3.1, since the first coordinates of
$x$ and $y$ agree we have

$$d_t(x, y) \sim t^{-3/2}$$

or even faster. Thus efficiency of the coupling strategy (in the sense of Definition 1)
must fail, proving the theorem.

Remark 1. Theorem 3 shows that any Markovian coupling strategy is non-efficient, 
in the sense that there exist some pairs of distinct starting points from which it is 
impossible to construct efficient couplings. But it does not imply Theorem 2, which 
shows non-maximality of Markovian couplings from any pair of distinct starting points. 

3.3. Markovian classic Kolmogorov couplings cannot be optimal 

To begin with, recall the simplest version of the Markovian coupling for the classic
Kolmogorov diffusion [5, 17]. Write $U$ for the process corresponding to the difference
between the Brownian motions, and $V$ for the difference in the time integrals of the
Brownian motions. We assume $U_0 = 1$ and $V_0 = 0$; generalization to the case of
arbitrary distinct starting points should be clear. We write the coupling probability
measure as $\mu_{(u,v)}$ when starting values are $U_0 = u$, $V_0 = v$.

We describe the coupling strategy of [5, 17] for $\mu_{(1,0)}$. When $U$ and $V$ have the same
sign, we apply reflection coupling (so that $U$ evolves as a Brownian motion run at rate
4). Thus the visits of $(U, V)$ to the axis $V = 0$ have to occur at isolated instants of
time: between each pair of visits the particle $(U, V)$ describes a "half-cycle" about the
origin. Over the $k$-th half-cycle, we actually apply reflection coupling until $U$ hits $U = \pm 2^{-k}$
or $V = 0$, taking the sign $\pm$ as the opposite of the sign of $U$ at the start of the
half-cycle. We then apply synchronous coupling till $V = 0$, so that $U$ is held constant. As
a result, $|U|$ will be no larger than $2^{-k}$ at the end of the $k$-th half-cycle.
It is convenient to introduce some notation. Suppose that the $k$-th cycle begins at
time $S_{k-1}$ (so $S_0 = 0$). Then $S_k = \inf\{t > S_{k-1} : V(t) = 0\}$. Let
$T_k = \min\{S_k, \inf\{t > S_{k-1} : U(t) = (-1)^k 2^{-k}\}\}$ be the first time that $U$ hits $U = (-1)^k 2^{-k}$, or $S_k$ if
that happens first. Thus $[T_k, S_k)$ is the part of the half-cycle in which synchronous
coupling is applied: one might think of this as the ballistic phase, while $[S_{k-1}, T_k)$ is
the Brownian phase. Note that the ballistic phase will be trivial if the Brownian phase
hits the $(V = 0)$-axis (that is, $T_k = S_k$).
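
A rough simulation can make the alternation of Brownian and ballistic phases concrete. The following is a minimal sketch, assuming a crude Euler discretisation in which the step size is shrunk by a factor 4 at each half-cycle to respect Brownian scaling; the function name, the step sizes, the step cap and the seed are all illustrative choices, not part of the coupling of [5, 17].

```python
import math
import random


def simulate_half_cycles(n_half_cycles=5, base_dt=1e-3, max_steps=200_000, seed=1):
    """Euler sketch of the half-cycle coupling for (U, V), started at (1, 0).

    Returns a list of (S_k, U(S_k)) for each half-cycle completed before the
    step budget max_steps is exhausted (the list may be shorter than requested).
    """
    rng = random.Random(seed)
    u, v, t = 1.0, 0.0, 0.0
    ends = []
    steps = 0
    for k in range(1, n_half_cycles + 1):
        sign = 1.0 if u > 0 else -1.0        # sign of U at the start of the half-cycle
        level = -sign * 2.0 ** (-k)          # target level, of the opposite sign
        dt = base_dt * 4.0 ** (-(k - 1))     # finer steps as the levels shrink
        while True:
            steps += 1
            if steps > max_steps:
                return ends                  # budget exhausted: report what was completed
            # Brownian phase: reflection coupling, so U is a rate-4 Brownian motion.
            u += 2.0 * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            v += u * dt
            t += dt
            if (u - level) * sign <= 0.0:
                # U reached the level: ballistic phase, U frozen, V runs to zero.
                u = level
                if v * sign > 0.0:
                    t += abs(v) / abs(level)
                v = 0.0
                break
            if v * sign <= 0.0:
                # V returned to zero during the Brownian phase: trivial ballistic phase.
                v = 0.0
                break
        ends.append((t, u))
    return ends


if __name__ == "__main__":
    for k, (s_k, u_k) in enumerate(simulate_half_cycles(), start=1):
        print(k, s_k, u_k)
```

In every completed half-cycle the recorded value satisfies $|U(S_k)| \le 2^{-k}$, as in the description above; because the first half-cycle has a heavy-tailed duration, a given seed may exhaust the step budget before all requested half-cycles finish.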

The time of successful coupling is the time at which $(U, V)$ hits the origin, and thus
it is given by

\[ \tau \;=\; \lim_{k \to \infty} S_k \;=\; \sum_{k=0}^{\infty} (S_{k+1} - S_k) . \]

Borel-Cantelli arguments show that this limit is finite [5]. Our first task is to determine 
the precise rate of decay of the probability of failing to couple by time t. 

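Theorem 4. Under the coupling $\mu_{(1,0)}$ described above, the coupling time $\tau$ satisfies

\[ \frac{C_1}{t^{1/3}} \;\le\; \mu_{(1,0)}\{\tau > t\} \;\le\; \frac{C_2}{t^{1/3}} \qquad \text{for } t > 1, \tag{14} \]

for some positive constants $C_1, C_2$.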

Proof. Note that scaling arguments show that $S_{k+1} - S_k$ has the same distribution as
$2^{-k} S_1$, so that the principal computation concerns the rate of decay of $\mu_{(1,0)}\{S_1 > t\}$.

Now $S_1 = T_1 + 2V(T_1) = 2\int_0^{T_1} (\tfrac12 + U(s))\,ds$, since either $V(T_1) = 0$ (when $T_1 = S_1$)
or the velocity of $V$ over $[T_1, S_1]$ is fixed at $-\tfrac12$. Writing $\tfrac12 + U = 2W$ for a
Brownian motion $W$ started from $\tfrac34$ (over the time interval $[S_0, T_1]$), and noting that
$S_1 = T_1 + 2V(T_1) = 4\int_0^{T_1} W_t\,dt$ (since $T_1 = \inf\{t > 0 : W_t = 0\}$), we see that $S_1/4$ can
be viewed as having the density of the random area under a one-dimensional Brownian
motion started from a positive level and measured till the first time it hits zero.

The density for this random area can be obtained from the calculations of [13,
equation (12)]. Consider a standard (rate 1) Brownian motion $W$ started from $a > 0$.
If $\sigma_a = \inf\{t > 0 : W_t = 0\}$, then the martingale $\mathbb{E}[\exp(-\lambda \int_0^{\sigma_a} W(s)\,ds) \mid \mathcal{F}_t]$ can be
used to show that the moment generating function of $\int_0^{\sigma_a} W(s)\,ds$ (viewed as a function
of the starting position $a$) must solve an Airy partial differential equation. The moment
generating function can be inverted to yield the distribution of $\int_0^{\sigma_a} W(s)\,ds$:

\[ \mathbb{P}\Bigl[\int_0^{\sigma_a} W(s)\,ds \in dy\Bigr] \;=\; \frac{2^{1/3}}{3^{2/3}\,\Gamma(1/3)}\,\frac{a}{y^{4/3}}\,\exp\Bigl(-\frac{2a^3}{9y}\Bigr)\,dy . \tag{15} \]

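From (15), it follows that for $t > 1$,

\[ \frac{6^{1/3} e^{-2a^3/9}\,a}{\Gamma(1/3)}\,\frac{1}{t^{1/3}} \;\le\; \mathbb{P}\Bigl[\int_0^{\sigma_a} W(s)\,ds > t\Bigr] \;\le\; \frac{6^{1/3}\,a}{\Gamma(1/3)}\,\frac{1}{t^{1/3}} . \tag{16} \]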

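As $S_1/4$ has the density (15) with $a = \tfrac34$, applying (16) at level $t/4$ (and using
$e^{-8a^3/(9t)} \ge e^{-8a^3/9}$ for $t > 1$ in the lower bound) gives us the following for $t > 1$:

\[ \frac{3 \cdot 24^{1/3} e^{-3/8}}{4\,\Gamma(1/3)}\,\frac{1}{t^{1/3}} \;\le\; \mu_{(1,0)}\{S_1 > t\} \;\le\; \frac{3 \cdot 24^{1/3}}{4\,\Gamma(1/3)}\,\frac{1}{t^{1/3}} . \tag{17} \]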
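
The tail bounds of the form (16) can be checked by direct numerical integration of the area density. The sketch below assumes the density $\frac{2^{1/3}}{3^{2/3}\Gamma(1/3)}\, a\, y^{-4/3} \exp(-2a^3/(9y))$ for the area under a standard Brownian motion started at $a$ and run until it hits zero; the substitution $y = s^{-3}$ turns the tail integral into a smooth integral over a bounded interval, evaluated here by Simpson's rule. Function names are illustrative.

```python
import math


def area_tail(a, t, n=2000):
    """P[area > t] for the area under a standard Brownian motion started at a > 0
    and stopped on hitting zero, by Simpson's rule (n must be even).

    With y = s**(-3) the tail integral becomes
        c * a * integral_0^{t^(-1/3)} 3 * exp(-b * s**3) ds,
    where b = 2 a^3 / 9 and c = 2^(1/3) / (3^(2/3) Gamma(1/3)).
    """
    b = 2.0 * a ** 3 / 9.0
    c = 2.0 ** (1.0 / 3.0) / (3.0 ** (2.0 / 3.0) * math.gamma(1.0 / 3.0))
    upper = t ** (-1.0 / 3.0)
    h = upper / n
    total = 0.0
    for i in range(n + 1):
        s = i * h
        weight = 1.0 if i in (0, n) else (4.0 if i % 2 else 2.0)
        total += weight * 3.0 * math.exp(-b * s ** 3)
    return c * a * total * h / 3.0


def tail_bounds(a, t):
    """Lower and upper bounds of the form (16), valid for t > 1."""
    upper = 6.0 ** (1.0 / 3.0) * a / (math.gamma(1.0 / 3.0) * t ** (1.0 / 3.0))
    lower = math.exp(-2.0 * a ** 3 / 9.0) * upper
    return lower, upper


if __name__ == "__main__":
    for t in (1.5, 10.0, 1000.0):
        lo, hi = tail_bounds(0.75, t)
        print(t, lo, area_tail(0.75, t), hi)
```

Both bounds differ from the exact tail only through the factor $\exp(-2a^3/(9y))$, which lies between $e^{-2a^3/9}$ and 1 on $y > 1$; this is why the decay rate is exactly $t^{-1/3}$.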
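We now apply this rate computation to $\tau = \sum_{k=0}^{\infty} (S_{k+1} - S_k)$. By Boole's inequality,
scaling, and (17), if $t > 1$ then

\[ \mu_{(1,0)}\Bigl\{\sum_{k=0}^{\infty} (S_{k+1} - S_k) > t\Bigr\} \;\le\; \sum_{k=0}^{\infty} \mu_{(1,0)}\Bigl\{S_{k+1} - S_k > \frac{\sqrt2 - 1}{\sqrt2}\,2^{-k/2}\,t\Bigr\} \;=\; \sum_{k=0}^{\infty} \mu_{(1,0)}\Bigl\{S_1 > \frac{\sqrt2 - 1}{\sqrt2}\,2^{k/2}\,t\Bigr\} \]

\[ \le\; \frac{3 \cdot 24^{1/3}}{4\,\Gamma(1/3)} \Bigl(\frac{\sqrt2}{\sqrt2 - 1}\Bigr)^{1/3} \frac{1}{1 - 2^{-1/6}}\,\frac{1}{t^{1/3}} , \]

and this establishes the right-hand inequality in (14). On the other hand (17) shows
that if $t > 1$ then

\[ \mu_{(1,0)}\Bigl\{\sum_{k=0}^{\infty} (S_{k+1} - S_k) > t\Bigr\} \;\ge\; \mu_{(1,0)}\{S_1 > t\} \;\ge\; \frac{3 \cdot 24^{1/3} e^{-3/8}}{4\,\Gamma(1/3)}\,\frac{1}{t^{1/3}} , \]

and this establishes the left-hand inequality in (14), thus completing the proof.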

The order of decay of the failure probability of coupling by time $t$ is therefore $t^{-1/3}$:
this Markovian coupling is far from efficient even when the Brownian motions start
from different points. In principle similar analyses should be possible for Markovian 
couplings of generalized Kolmogorov diffusions, though in such cases progress is difficult 
because we only know of implicitly defined Markovian coupling constructions for index 
exceeding 2, while analysis for the case of index 2 is complicated [17]. 

We now consider the interesting question, can there be an optimal coupling rate, 
optimizing over all Markovian couplings but obtained by a single Markovian coupling 
of a Kolmogorov diffusion? For a study of this question in the different context of 
coupling for Brownian motion together with a local time, see [16]. In contrast to the 
case of [16], we will prove that such a coupling cannot exist for the classical Kolmogorov 
diffusion. 

Theorem 5. There does not exist an optimal Markovian coupling strategy for two 
copies of the classical Kolmogorov diffusion. 

Proof. Fix attention on classical Kolmogorov diffusions which initially differ only in
their second coordinate. Denote by $U$ the difference between the Brownian motions and
by $V$ the difference in the time integrals. By scaling and stopping arguments, we may
concentrate on the situation in which $U_0 = 0$ and $V_0 = 1$. Our strategy will be to obtain
for each $t > 0$, a specific Markovian coupling $\mu^{(t)}$ with coupling time $\tau^{(t)}$ such that
$\mu^{(t)}\{\tau^{(t)} > t\} \le C/t$ for all $t > 0$, where the constant $C$ does not depend on $t$. If there
were to exist an optimal Markovian coupling $\nu$ with coupling time $\tau^*$, then the above
would imply that it should satisfy $\nu\{\tau^* > t\} \le C/t$ for all $t > 0$. This would contradict
Lemma 3.1, which shows that for any (successful) Markovian coupling $\nu$ there must
exist positive constants $C_\nu, t_\nu$ such that for all $t > t_\nu$, $\nu\{\tau^* > t\} \ge C_\nu/\sqrt{t}$.

First we must describe the coupling $\mu^{(t)}$. The coupling $\mu^{(t)}$ differs from the coupling
described at the start of this subsection essentially only by early completion of the
Brownian phase of its initial half-cycle:

1. Couple the Brownian motions by reflection till time

\[ T_1' = \inf\{s > 0 : U_s = -4/t\} \wedge \inf\{s > 0 : V_s = 0\} . \]

Then couple synchronously till time

\[ S_1 = \inf\{s > T_1' : V_s = 0\} . \]

2. Starting from $(U_{S_1}, 0)$, employ the coupling $\mu_{(U_{S_1}, 0)}$ described at the start of this
subsection.

In the following, $C_1, C_2, \ldots$ will be positive constants that do not depend on $t$.

First observe that $T_1'$ is smaller than the hitting time of the (rate 4) Brownian motion
$U$ on the level $-4/t$ when started at 0. Therefore arguments using the reflection principle
show that $\mu^{(t)}\{T_1' > 1\} \le C_1/t$. On the other hand, since $V_0 = 1$,

\[ V(T_1') \;=\; 1 + \int_0^{T_1'} (U(s) + 4/t)\,ds - (4/t)\,T_1' \;\le\; 1 + \int_0^{T_1'} (U(s) + 4/t)\,ds , \]

and we can apply (15) to deduce that $\mu^{(t)}\{V_{T_1'} > 2\} \le C_2/t$. Now $S_1 - T_1' \le t/2$ on the
event $\{V_{T_1'} \le 2\}$. By scaling arguments and Theorem 4,

\[ \mu^{(t)}\{\tau^{(t)} - S_1 > t/2\} \;\le\; \mu_{(4/t,\,0)}\{\tau > t/2\} \;=\; \mu_{(1,0)}\{32\,t^{-3}\,\tau > 1\} \;\le\; \frac{C_3}{t} , \]

where $\tau$ denotes the coupling time under the coupling strategy described at the start
of this subsection.

Combining the above facts, we obtain $\mu^{(t)}\{\tau^{(t)} > t + 1\} \le C_4/t$. The theorem follows.

4. Efficiency for the finite-look-ahead coupling 

Despite these negative results, it is possible to exhibit a simple and explicit
efficient coupling strategy if we allow couplings which are permitted finite but varying
amounts of precognition. In this section, we will describe a simple non-Markovian
coupling strategy which achieves efficiency. Our approach will be to divide time into
successive intervals $[S_n, S_{n+1}]$ of growing size and then to couple the driving Brownian
motions (and hence the Kolmogorov diffusions) according to a non-Markovian recipe
on each such interval. We call this coupling a finite-look-ahead coupling as the coupling
construction on each interval $[S_n, S_{n+1}]$, although non-Markovian, requires information
on the driving Brownian paths only till time $S_{n+1}$. Further, the coupling of the future
paths of the Kolmogorov diffusions conditional on the paths run up till time $S_n$ depends
on the past only through the values taken at time $S_n$. Thus the coupling restricted to
times $S_n$ (for $n = 0, 1, \ldots$) can be considered to have a Markovian property.

Recall that in (8), we wrote down a representation for the Kolmogorov diffusion in
terms of a Brownian motion vector $\underline{W}$ at time $T$. Note in particular that the lower-triangular
form of $L$ means that (8) represents $I_0(T)$ by $I_0(T) = x_0 + \underline{W}_0(T)$. However
this holds only for the stipulated fixed time $T$; we cannot maintain the representation
(8) of $\underline{I}(T)$ in terms of the full vector $\underline{W}$ of independent standard Brownian motions
while simultaneously writing $I_0(t) = x_0 + \underline{W}_0(t)$ for $0 \le t \le T$. The representation
(8) can be best understood as a fragment of an infinite-dimensional representation
as follows. Consider the initial Brownian path $\{B(t) : 0 \le t \le T\}$ as a Gaussian
vector in the infinite dimensional space $C([0,T])$ of continuous paths. Realize this
as the evaluation at time $T$ of an infinite-dimensional Brownian motion $B$ starting
at the zero path and taking values in the Banach space $C([0,T])$, and evolving in
"algorithmic time" (as opposed to the "process time" $t$ used for the stochastic process
$\{B(t) : 0 \le t \le T\}$). A candidate for this is given by the Karhunen-Loeve expansion

\[ B(\zeta, t) \;=\; \sum_{k=1}^{\infty} \sqrt{\lambda_k}\,W_k(\zeta)\,f_k(t/T) \tag{18} \]

for $\zeta, t \in [0,T]$, where $\lambda_k = \frac{1}{(k - \frac12)^2 \pi^2}$, $f_k(t) = \sqrt2 \sin((k - \tfrac12)\pi t)$ denote the
eigenvalues and eigenfunctions respectively of the covariance kernel of Brownian motion
viewed as a bounded operator acting on $L^2[0,1]$, and the $W_k$s are independent standard
Brownian motions. Here $\zeta$ represents the algorithmic time and $t$ represents the process
time. (See [7] for more on stochastic analysis on infinite dimensional spaces.)
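
As a sanity check on the Karhunen-Loeve representation, one can verify numerically that the eigen-expansion reproduces the Brownian covariance. The sketch below assumes the standard eigenpairs $\lambda_k = ((k - \tfrac12)\pi)^{-2}$, $f_k(t) = \sqrt2 \sin((k - \tfrac12)\pi t)$ on $[0,1]$ and checks the Mercer identity $\sum_k \lambda_k f_k(s) f_k(t) = \min(s,t)$ by truncating the series; the function name and truncation level are illustrative.

```python
import math


def kl_covariance(s, t, n_terms=20000):
    """Truncated Mercer series sum_k lambda_k f_k(s) f_k(t) for Brownian motion
    on [0, 1]; the full series equals min(s, t)."""
    total = 0.0
    for k in range(1, n_terms + 1):
        w = (k - 0.5) * math.pi          # square root of 1 / lambda_k
        total += 2.0 * math.sin(w * s) * math.sin(w * t) / (w * w)
    return total


if __name__ == "__main__":
    for s, t in ((0.3, 0.7), (0.5, 0.5), (0.9, 0.2)):
        print(s, t, kl_covariance(s, t), min(s, t))
```

The truncation error is of order $1/N$ for $N$ terms. Rescaling process time by $T$ and evaluating the driving Brownian motions $W_k$ at algorithmic time $\zeta$ multiplies this covariance by $\zeta$, so that at $\zeta = T$ the path $t \mapsto B(T,t)$ has covariance $\min(s,t)$.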

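Set $I_0(t) = x_0 + B(T,t)$. For $1 \le r \le k$ and $t \in [0,T]$, the iterated integrals $I_r(t)$
can be viewed as $I_r(T,t)$ where $\{I_r(\zeta,t) : \zeta, t \in [0,T]\}$ have the representation

\[ I_r(\zeta, t) \;=\; x_r + t\,x_{r-1} + \frac{t^2}{2}\,x_{r-2} + \cdots + \frac{t^r}{r!}\,x_0 + T^r \sum_{k=1}^{\infty} \sqrt{\lambda_k}\,W_k(\zeta)\,f_{r,k}(t/T) \tag{19} \]

with $f_{r,k}(t) = \int_0^t \int_0^{s_{r-1}} \cdots \int_0^{s_1} f_k(s_0)\,ds_0\,ds_1 \ldots ds_{r-1}$.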

It follows from (8) and (19) that, for a fixed process time $T$, the random process
$\{\underline{W}(\zeta) : \zeta \in [0,T]\}$ obtained by

\[ \underline{W}(\zeta) \;=\; L^{-1} D(T^{-1}) \bigl(\underline{I}(\zeta, T) - H(T)\,\underline{x}\bigr) \tag{20} \]

is a standard $(k+1)$-dimensional Brownian motion evolving in algorithmic time up to
time $T$. Further, from the representation in (19), it follows that the Brownian motion
$\underline{W}$, obtained in this way, does not depend on $T$. Thus we can write the components
$\underline{W}_j$ of $\underline{W}$ as linear combinations of the Brownian motions $W_k$:

\[ \underline{W}_j(\zeta) \;=\; \sum_{k=1}^{\infty} W_k(\zeta)\,c_{j,k} , \tag{21} \]

where the $c_{j,k}$ do not depend on $T$.

It will be convenient to define the infinite matrix $A$ whose $(j,k)$-th entry is given
by $c_{j,k}$ and to note that its rows must form an independent orthonormal set in the
sequence space $\ell^2$.

We now describe the finite-look-ahead coupling by constructing the coupled paths in
each of the successive time intervals (look-ahead-blocks) $[T_1 + \ldots + T_n, T_1 + \ldots + T_{n+1}]$
using the Brownian motion on the infinite-dimensional space $C([0, T_n])$ described
in (18), for $T_n = \alpha^n$ for some fixed $\alpha > 1$. In the following, set $S_0 = 0$ and $S_n =
T_1 + \ldots + T_n$ to be the time of initiation of the $n$-th look-ahead-block.

Before commencing detailed analysis, we provide a brief heuristic description of the
coupling. Recall that the reflection coupling of Brownian motions started from distinct
points gives a maximal coupling by using reflection on one Brownian path to produce
the other (till the coupling time), using the hyperplane that bisects the line joining
their respective starting points. Although the Kolmogorov diffusion $\{\underline{I}(t) : t \in [0,T]\}$ is
not a Brownian motion in process time, the process $\{\underline{W}(\zeta) : \zeta \in [0,T]\}$ obtained from
$\underline{I}(\zeta, T)$ in (20) for a fixed process time $T$ evolves as a Brownian motion in algorithmic
time. With this observation in mind, we couple two Kolmogorov diffusions $\underline{I}^{(1)}$ and $\underline{I}^{(2)}$
started from $\underline{x}^{(1)}$ and $\underline{x}^{(2)}$ respectively on the block $[0,T]$ as follows:

1. Couple two infinite-dimensional Brownian motions $\{B^{(1)}(\zeta,t) : \zeta, t \in [0,T]\}$
and $\{B^{(2)}(\zeta,t) : \zeta, t \in [0,T]\}$ in such a way that the processes $\underline{W}^{(1)}$ and $\underline{W}^{(2)}$
obtained from $B^{(1)}$ and $B^{(2)}$ respectively via (19) and (20) are reflection
coupled (in algorithmic time) by reflecting in the hyperplane bisecting the initial
discrepancy vector $\underline{W}^{(1)}(0) - \underline{W}^{(2)}(0)$.

2. Repeat this construction on each block $[S_n, S_{n+1}]$, updating the starting points
of the respective Kolmogorov diffusions to $\underline{I}^{(1)}(S_n)$ and $\underline{I}^{(2)}(S_n)$.

This coupling of the infinite-dimensional Brownian motions projects down to a coupling
of the corresponding driving Brownian motions (and hence the Kolmogorov diffusions)
by setting

\[ I_0^{(i)}(t) \;=\; I_0^{(i)}(S_n) + B_n^{(i)}(T_{n+1},\, t - S_n) \tag{22} \]

for $t \in [S_n, S_{n+1}]$ and $i = 1, 2$.

It is reasonable to expect that if such a coupling is achieved then it will be efficient, 
as we are using reflection coupling of Brownian motions (in algorithmic time) in each 
block. The coupling is analysed in what follows, and efficiency of the coupling is shown 
in Theorem 6. 

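We will give an inductive description of the coupling for Kolmogorov diffusions $\underline{I}^{(1)}$
and $\underline{I}^{(2)}$ started from $\underline{x}^{(1)}$ and $\underline{x}^{(2)}$ respectively. Suppose we have constructed the
coupling on $[0, S_n]$. If $\underline{I}^{(1)}(S_n) = \underline{I}^{(2)}(S_n)$, we can synchronously couple the Brownian
motions $I_0^{(1)}$ and $I_0^{(2)}$ after $S_n$. So, henceforth we assume $\underline{I}^{(1)}(S_n) \ne \underline{I}^{(2)}(S_n)$. We will
couple two infinite-dimensional Brownian motions $B_n^{(1)}$ and $B_n^{(2)}$ on the block $[0, T_{n+1}]$,
which are represented by the Karhunen-Loeve expansion

\[ B_n^{(i)}(\zeta, t) \;=\; \sum_{k=1}^{\infty} \sqrt{\lambda_k}\,w_{n,k}^{(i)}(\zeta)\,f_k(t/T_{n+1}) \tag{23} \]

for $\zeta, t \in [0, T_{n+1}]$ and $i = 1, 2$. The corresponding coupling for the Brownian motions
$I_0^{(1)}$ and $I_0^{(2)}$ (and thus for the entire diffusions $\underline{I}^{(1)}$ and $\underline{I}^{(2)}$) on the block $[S_n, S_{n+1}]$
can then be obtained by (22).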

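The $(k+1)$-dimensional Brownian motions running in algorithmic time obtained
from the iterated integrals by (20) are denoted by $\underline{W}_n^{(1)}$ and $\underline{W}_n^{(2)}$ respectively.

Write $\underline{Z}_n = \underline{I}^{(1)}(S_n) - \underline{I}^{(2)}(S_n)$. Define the unit vectors

\[ \underline{\nu}_n = \frac{L^{-1} D(T_n^{-1})\,\underline{Z}_n}{\|L^{-1} D(T_n^{-1})\,\underline{Z}_n\|} \quad \Bigl(\text{taking } \underline{\nu}_0 = \frac{L^{-1}\underline{z}}{\|L^{-1}\underline{z}\|}\Bigr) \qquad \text{and} \qquad \underline{\eta}_n = \frac{L^{-1} H D(T_{n+1}^{-1})\,\underline{Z}_n}{\|L^{-1} H D(T_{n+1}^{-1})\,\underline{Z}_n\|} . \]

Let $\{B_{n,j} : j \ge 1\}$ be independent standard scalar Brownian motions. Recall the matrix
$A$ defined just after (21). The infinite vector $\underline{v}_{n,1} = A^{\top} \underline{\eta}_n$ has $\ell^2$-norm one. We can
extend it to an orthonormal basis of $\ell^2$, say $\{\underline{v}_{n,1}, \underline{v}_{n,2}, \ldots\}$. Let $P$ denote the (infinite)
orthogonal matrix with columns formed by these vectors. Note that $P$ is a unitary matrix, and
therefore the rows of $P$ are also orthonormal. Now, we can define a coupling of the
component Brownian motions $w_{n,k}^{(i)}$ in terms of the matrix $P$ and the Brownian motions $B_{n,j}$.

Define the stopping time $\sigma_n = \inf\{\zeta > 0 : B_{n,1}(\zeta) = -\tfrac12 \|L^{-1} H D(T_{n+1}^{-1})\,\underline{Z}_n\|\}$.
Then the coupling is a reflection coupling as follows:

\[ w_{n,k}^{(i)}(\zeta) \;=\; \begin{cases} (-1)^{i+1} P_{k,1} B_{n,1}(\zeta) + \sum_{j=2}^{\infty} P_{k,j} B_{n,j}(\zeta) & \text{when } \zeta \le \sigma_n , \\[4pt] w_{n,k}^{(i)}(\sigma_n) + \sum_{j=1}^{\infty} P_{k,j} \bigl(B_{n,j}(\zeta) - B_{n,j}(\sigma_n)\bigr) & \text{when } \zeta > \sigma_n , \end{cases} \tag{24} \]

for $i = 1, 2$. This gives the coupling between $B_n^{(1)}$ and $B_n^{(2)}$, and thus the corresponding
Kolmogorov diffusions $\underline{I}^{(1)}$ and $\underline{I}^{(2)}$ via (23).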

Note that the reflection coupling recipe (24) along with (21) give us

\[ \underline{W}_n^{(1)}(\zeta) - \underline{W}_n^{(2)}(\zeta) \;=\; 2 B_{n,1}(\zeta \wedge \sigma_n)\,\underline{\eta}_n . \tag{25} \]

Thus the $(k+1)$-dimensional Brownian motions $\underline{W}_n^{(1)}$ and $\underline{W}_n^{(2)}$ (running in algorithmic
time) are coupled in such a way that their difference is a rate 4 scalar Brownian motion
running along the vector $\underline{\eta}_n$. This will be the crucial fact we will use to prove efficiency
of this coupling strategy in the following theorem.
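
The mechanism behind this statement is ordinary reflection in a hyperplane. As a finite-dimensional illustration (a sketch only: the infinite-dimensional construction via the matrix $P$ is not reproduced, and the helper name is hypothetical), reflecting an increment in the hyperplane orthogonal to a unit vector $\eta$ preserves its length and makes the difference between the increment and its reflection a scalar multiple of $\eta$.

```python
import math


def reflect_increment(dw, eta):
    """Reflect the vector dw in the hyperplane orthogonal to the unit vector eta,
    that is, return dw - 2 (eta . dw) eta."""
    dot = sum(x * e for x, e in zip(dw, eta))
    return [x - 2.0 * dot * e for x, e in zip(dw, eta)]


if __name__ == "__main__":
    eta = [0.6, 0.8, 0.0]                 # unit vector
    dw = [1.0, 2.0, 3.0]                  # an increment of the first process
    mirrored = reflect_increment(dw, eta)
    difference = [a - b for a, b in zip(dw, mirrored)]
    print(mirrored)                       # increment of the reflected process
    print(difference)                     # equals 2 (eta . dw) eta
    print(math.sqrt(sum(x * x for x in dw)), math.sqrt(sum(x * x for x in mirrored)))
```

Since the component $\eta \cdot dW$ of a standard Brownian increment is itself a standard scalar Brownian increment, the difference of the two coupled processes moves along $\eta$ as twice a scalar Brownian motion, which is a Brownian motion run at rate 4.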


Theorem 6. (Efficient finite-look-ahead coupling strategies.) The finite-look-ahead
coupling of the Kolmogorov diffusion of index $k$, constructed as above over successive
intervals of lengths $T_n = \alpha^n$ for some fixed $\alpha > 1$, provides an efficient coupling strategy
for the Kolmogorov diffusion.

Remark 2. An efficient coupling strategy for the Kolmogorov diffusion has to couple
at a much faster rate when the initial discrepancy vector $\underline{z}$ has an initial segment
$z_0, z_1, \ldots, z_{r-1}$ which is zero: see Theorem 1.


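Proof. Consider the above coupling between two index $k$ generalized Kolmogorov
diffusions $\underline{I}^{(1)}$, $\underline{I}^{(2)}$ begun at $\underline{x}^{(1)}$, $\underline{x}^{(2)}$ respectively, with $\underline{z} = \underline{x}^{(1)} - \underline{x}^{(2)}$. Setting
$\underline{Z}_0 = \underline{z} = \underline{x}^{(1)} - \underline{x}^{(2)}$ and $S_0 = T_0 = 0$, and employing the representation (20), we may
write for $n \ge 1$,

\[ \underline{Z}_n(\zeta) \;=\; \underline{I}^{(1)}(S_{n-1} + \zeta) - \underline{I}^{(2)}(S_{n-1} + \zeta) \;=\; D(T_n)\,L\,\bigl(L^{-1} H D(T_n^{-1})\,\underline{Z}_{n-1} + \Delta_n \underline{W}(\zeta)\bigr) \tag{26} \]

where $\Delta_n \underline{W}(\zeta) = \underline{W}_{n-1}^{(1)}(\zeta) - \underline{W}_{n-1}^{(2)}(\zeta)$ and $\zeta \in [0, T_n]$. We will write $\underline{Z}_n$ for $\underline{Z}_n(T_n)$
and $\Delta_n \underline{W}$ for $\Delta_n \underline{W}(T_n)$.

Set $F_0(0) = \|L^{-1}\underline{z}\|$ and $\underline{\nu}_0 = L^{-1}\underline{z} / \|L^{-1}\underline{z}\|$. For $n \ge 1$, write $F_n(\zeta) = \|L^{-1} D(T_n^{-1})\,\underline{Z}_n(\zeta)\|$
and recall the unit vectors

\[ \underline{\nu}_n = \frac{L^{-1} D(T_n^{-1})\,\underline{Z}_n}{\|L^{-1} D(T_n^{-1})\,\underline{Z}_n\|} \qquad \text{and} \qquad \underline{\eta}_n = \frac{(L^{-1} H D(\alpha^{-1}) L)\,\underline{\nu}_n}{\|(L^{-1} H D(\alpha^{-1}) L)\,\underline{\nu}_n\|} \]

(defined when $\underline{Z}_n \ne \underline{0}$). As before, we will write $F_n$ for $F_n(T_n)$.

Then (26) gives us the following:

\[ F_n(\zeta) \;=\; \|L^{-1} D(T_n^{-1})\,\underline{Z}_n(\zeta)\| \;=\; \|L^{-1} H D(T_n^{-1})\,\underline{Z}_{n-1} + \Delta_n \underline{W}(\zeta)\| \;=\; \|F_{n-1}\,(L^{-1} H D(\alpha^{-1}) L)\,\underline{\nu}_{n-1} + \Delta_n \underline{W}(\zeta)\| . \tag{27} \]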
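Suppose $\underline{Z}_{n-1} \ne \underline{0}$. Then from (25), we have

\[ \Delta_n \underline{W}(\zeta) \;=\; 2 B_{n-1,1}(\zeta \wedge \sigma_{n-1})\,\underline{\eta}_{n-1} . \]

Substituting this into (27), and noting the definition of the stopping time $\sigma_{n-1}$, we get

\[ F_n(\zeta) \;=\; \Bigl(F_{n-1} + \frac{2 B_{n-1,1}(\zeta \wedge \sigma_{n-1})}{\|(L^{-1} H D(\alpha^{-1}) L)\,\underline{\nu}_{n-1}\|}\Bigr)\,\|(L^{-1} H D(\alpha^{-1}) L)\,\underline{\nu}_{n-1}\| . \tag{28} \]

Suppose the coupling has not been successful in $[0, S_n]$. Then putting $\zeta = T_n$ in (27)
and using (25) again, we proceed similarly as above to obtain the following recursive
relation:

\[ F_n\,\underline{\nu}_n \;=\; \Bigl(F_{n-1} + \frac{2 B_{n-1,1}(T_n \wedge \sigma_{n-1})}{\|(L^{-1} H D(\alpha^{-1}) L)\,\underline{\nu}_{n-1}\|}\Bigr)\,(L^{-1} H D(\alpha^{-1}) L)\,\underline{\nu}_{n-1} . \tag{29} \]

Thus, for this sequential coupling scheme it follows from (29) that the vector $\underline{\nu}_n$ is
deterministic, and indeed

\[ \underline{\nu}_n \;=\; \frac{(L^{-1} H D(\alpha^{-1}) L)\,\underline{\nu}_{n-1}}{\|(L^{-1} H D(\alpha^{-1}) L)\,\underline{\nu}_{n-1}\|} \;=\; \frac{(L^{-1} H D(\alpha^{-1}) L)^n\,\underline{\nu}_0}{\|(L^{-1} H D(\alpha^{-1}) L)^n\,\underline{\nu}_0\|} . \tag{30} \]

So we can re-express (28) in terms of $\underline{\nu}_0$:

\[ F_n(\zeta) \;=\; \Bigl(F_{n-1} + 2 B_{n-1,1}(\zeta \wedge \sigma_{n-1})\,\frac{\|(L^{-1} H D(\alpha^{-1}) L)^{n-1}\,\underline{\nu}_0\|}{\|(L^{-1} H D(\alpha^{-1}) L)^{n}\,\underline{\nu}_0\|}\Bigr)\,\frac{\|(L^{-1} H D(\alpha^{-1}) L)^{n}\,\underline{\nu}_0\|}{\|(L^{-1} H D(\alpha^{-1}) L)^{n-1}\,\underline{\nu}_0\|} . \tag{31} \]

Consequently, if we define

\[ G_n(\zeta) \;=\; \frac{F_n(\zeta)}{\|(L^{-1} H D(\alpha^{-1}) L)^{n}\,\underline{\nu}_0\|} \]

for $\zeta \in [0, T_n]$, then (31) gives us

\[ G_n(\zeta) \;=\; G_{n-1} + \frac{2 B_{n-1,1}(\zeta \wedge \sigma_{n-1})}{\|(L^{-1} H D(\alpha^{-1}) L)^{n}\,\underline{\nu}_0\|} \tag{32} \]

for $\zeta \in [0, T_n]$, where as before, we write $G_{n-1}$ for $G_{n-1}(T_{n-1})$.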

Consider now the (deterministic) evolution of the length $\|(L^{-1} H D(\alpha^{-1}) L)^n\,\underline{\nu}_0\|$.
The matrices $L^{-1} H D(\alpha^{-1}) L$ and $H D(\alpha^{-1})$ share the same eigenvalues. Since $H$ is
lower-triangular, and has values 1 along the diagonal, and since $\alpha^{-1} \in (0,1)$, it follows
that the eigenvalues of $L^{-1} H D(\alpha^{-1}) L$ are distinct and positive, and are $1, \alpha^{-1}, \ldots,
\alpha^{-k}$. Consequently it follows that $L^{-1} H D(\alpha^{-1}) L$ has $k + 1$ distinct eigenvectors; let
$\underline{e}_0, \underline{e}_1, \ldots, \underline{e}_k$ be the eigenvectors corresponding to the eigenvalues $1, \alpha^{-1}, \ldots, \alpha^{-k}$; we
suppose these to be of unit Euclidean norm. Note in particular that = (1, 0,..., 0)^ 
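while $\underline{e}_1, \ldots, \underline{e}_k$ all have zero initial coordinate (this follows from the lower-triangular
nature of the matrices $L$, $H$, $D(\alpha^{-1})$; moreover the explicit construction of $L$ implies
that its top row is given by $(1, 0, \ldots, 0)$). Write $\underline{\nu}_0 = \gamma_0 \underline{e}_0 + \gamma_1 \underline{e}_1 + \cdots + \gamma_k \underline{e}_k$.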
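We suppose first that $\gamma_0 \ne 0$. From the above it follows that

\[ \|(L^{-1} H D(\alpha^{-1}) L)^n\,\underline{\nu}_0\| \;=\; \|\gamma_0 \underline{e}_0 + \gamma_1 \alpha^{-n} \underline{e}_1 + \ldots + \gamma_k \alpha^{-kn} \underline{e}_k\| , \tag{33} \]

which by the triangle inequality lies in the range

\[ |\gamma_0| - \max\{|\gamma_i| : i = 1, \ldots, k\}\,\alpha^{-n} \;\le\; \|(L^{-1} H D(\alpha^{-1}) L)^n\,\underline{\nu}_0\| \;\le\; |\gamma_0| + \max\{|\gamma_i| : i = 1, \ldots, k\}\,\alpha^{-n} . \tag{34} \]

Choose $n_0$ such that the lower bound of this range exceeds 0 for $n \ge n_0$.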
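
The stabilisation of these lengths is easy to observe numerically. The sketch below makes two simplifying assumptions that are not in the text: the matrix $L$ is replaced by the identity (so only $H D(\alpha^{-1})$ is iterated), and $H$ is taken to have entries $1/(i-j)!$ for $i \ge j$, which is what the polynomial coefficients in (19) suggest at unit time. All function names are illustrative.

```python
import math


def hd_matrix(k, alpha):
    """H D(alpha^{-1}), with H[i][j] = 1/(i-j)! for i >= j (zero otherwise)
    and D(alpha^{-1}) = diag(1, alpha^-1, ..., alpha^-k)."""
    return [[alpha ** (-j) / math.factorial(i - j) if i >= j else 0.0
             for j in range(k + 1)] for i in range(k + 1)]


def mat_vec(m, v):
    """Matrix-vector product for lists of lists."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def norm(v):
    """Euclidean norm."""
    return math.sqrt(sum(x * x for x in v))


def norm_sequence(k, alpha, v0, n_max):
    """Return [||M v0||, ||M^2 v0||, ..., ||M^n_max v0||] for M = H D(alpha^{-1})."""
    m = hd_matrix(k, alpha)
    v = list(v0)
    out = []
    for _ in range(n_max):
        v = mat_vec(m, v)
        out.append(norm(v))
    return out


if __name__ == "__main__":
    generic = norm_sequence(2, 2.0, [1.0, 0.0, 0.0], 40)
    degenerate = norm_sequence(2, 2.0, [0.0, 1.0, 0.0], 40)
    print(generic[-1] / generic[-2])        # tends to 1: component along eigenvalue 1
    print(degenerate[-1] / degenerate[-2])  # tends to 1/alpha: first coordinate vanishes
```

When the first coordinate of the starting vector is non-zero, the length converges to a positive limit at geometric rate $\alpha^{-n}$, in line with (34); when it vanishes, the length itself decays geometrically, which is the situation treated separately later in the proof.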

Observe from the "algorithmic time" interpolation (32) that if we set $G(0) = F_0(0) =
\|L^{-1}\underline{z}\|$ and $G(\zeta) = G_n(\zeta - S_{n-1})$ for $\zeta \in [S_{n-1}, S_n]$, then $G$ is driven by a time-changed
Brownian motion built up from the increments

\[ \frac{2 B_{n-1,1}\bigl((\zeta - S_{n-1}) \wedge \sigma_{n-1}\bigr)}{\|(L^{-1} H D(\alpha^{-1}) L)^{n}\,\underline{\nu}_0\|} \qquad \text{for } S_{n-1} \le \zeta \le S_n . \]

Moreover, $G(\zeta) = 0$ for some $\zeta \in [S_{n-1}, S_n]$ (this happens when $\zeta - S_{n-1} = \sigma_{n-1}$)
if and only if $\underline{Z}_n = \underline{0}$ and thus the Kolmogorov diffusions $\underline{I}^{(1)}$ and $\underline{I}^{(2)}$ (running in
"process time") couple by time $S_n$. The ratio $S_n/S_{n-1}$ always equals $\alpha$, so a simple
asymptotic analysis shows that for coupling efficiency considerations it is sufficient to
analyse the hitting time of zero for the process $G$.

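We see from (34) that by time $S_n = \alpha + \ldots + \alpha^n$ the intrinsic time of the time-changed
Brownian motion will lie between the two values

\[ \sum_{m=1}^{n_0 - 1} \frac{4\alpha^m}{\|(L^{-1} H D(\alpha^{-1}) L)^m\,\underline{\nu}_0\|^2} \;+\; \sum_{m=n_0}^{n} \frac{4\alpha^m}{\bigl(|\gamma_0| \pm \max\{|\gamma_i| : i = 1, \ldots, k\}\,\alpha^{-m}\bigr)^2} , \]

and so the probability of $G$ not yet having hit zero will be of order

\[ C_1 \Bigl(C_3 + \sum_{m=n_0}^{n} \frac{4\alpha^m}{\bigl(|\gamma_0| - \max\{|\gamma_i| : i = 1, \ldots, k\}\,\alpha^{-m}\bigr)^2}\Bigr)^{-1/2} \;\le\; \mathbb{P}\bigl[G \text{ has not hit zero by time } S_n\bigr] \;\le\; C_2 \Bigl(C_3 + \sum_{m=n_0}^{n} \frac{4\alpha^m}{\bigl(|\gamma_0| + \max\{|\gamma_i| : i = 1, \ldots, k\}\,\alpha^{-m}\bigr)^2}\Bigr)^{-1/2} \]

for positive constants $C_1, C_2, C_3$.

Asymptotic analysis for large $n$ now shows that efficiency follows in the case $\gamma_0 \ne 0$.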

Suppose $\gamma_0 = 0$, so the first coordinate of $\underline{z}$ must vanish. In case the first $r > 0$
coordinates of $\underline{z}$ vanish, the deterministic evolution of $\underline{\nu}_n$ given by (30) together with
the lower-triangular nature of $L^{-1} H D(\alpha^{-1}) L$ ensures that the first $r$ coordinates of $\underline{\nu}_n$
also vanish, for all $n$. Hence,

\[ \|(L^{-1} H D(\alpha^{-1}) L)\,\underline{\nu}_{n-1}\| \;\le\; \alpha^{-r} . \tag{35} \]

We now argue much as above, but working with the rather simpler comparison process

\[ \tilde G(\zeta) \;=\; 2\alpha^{nr} B_{n-1,1}\bigl((\zeta - S_{n-1}) \wedge \sigma_{n-1}\bigr) + \alpha^{r}\,\|(L^{-1} H D(\alpha^{-1}) L)\,\underline{\nu}_{n-1}\|\,\tilde G(S_{n-1}) , \]

for $S_{n-1} < \zeta \le S_n$, where $\tilde G(0) = \|L^{-1}\underline{z}\|$.

From (35), $\alpha^{r}\,\|(L^{-1} H D(\alpha^{-1}) L)\,\underline{\nu}_{n-1}\| \le 1$. Thus, the jumps of the process $\tilde G$ at
the times $\{S_i\}_{i \ge 0}$ decrease the value of $\tilde G$. Hence, the hitting time of zero for $\tilde G$
is dominated by that of a new Brownian motion run till time

\[ \alpha^{2r} T_1 + \ldots + \alpha^{2nr} T_n \;=\; \alpha^{2r+1} + \ldots + \alpha^{n(2r+1)} \;\sim\; C_1\,\alpha^{(n+1)(2r+1)} . \]

This avoids 0 up to this time with probability of order $\alpha^{-(n+1)(2r+1)/2}$. Since $S_n \sim
C_2\,\alpha^{n+1}$, we deduce that the asymptotics for the probability of avoiding zero before $S_n$
are of order at most $S_n^{-(2r+1)/2}$.

As a consequence it follows that this finite-look-ahead coupling provides an efficient
coupling strategy, taking efficient advantage of faster coupling when an initial set of
coordinates of $\underline{z}$ vanish. This completes the proof.

Suppose we require the range of precognition to be bounded throughout the coupling
procedure: is it still possible to produce efficient couplings, so long as the Brownian
components of the two coupled Kolmogorov diffusions start at different points? We
do not yet have a general answer to this question, but can show that the obvious approach
cannot work. Consider the above finite-look-ahead coupling of the Kolmogorov
diffusion of index $k$, defined over successive intervals of length $T_n = 1$ (so in particular
the look-ahead is bounded). Then this coupling may not even succeed!

We justify this assertion by proceeding as in the proof of Theorem 6, but taking
$T_n = 1$, so $S_n = n$. Then (28) and (30) simplify: we obtain

\[ F_n(\zeta) \;=\; \|(L^{-1} H L)\,\underline{\nu}_{n-1}\|\,F_{n-1} + 2 B_{n-1,1}(\zeta \wedge \sigma_{n-1}) , \qquad \frac{(L^{-1} H L)\,\underline{\nu}_{n-1}}{\|(L^{-1} H L)\,\underline{\nu}_{n-1}\|} \;=\; \frac{L^{-1} H(n)\,\underline{z}}{\|L^{-1} H(n)\,\underline{z}\|} . \]

Accordingly we obtain

\[ F_n(\zeta) \;=\; \frac{\|L^{-1} H(n)\,\underline{z}\|}{\|L^{-1} H(n-1)\,\underline{z}\|}\,F_{n-1} + 2 B_{n-1,1}(\zeta \wedge \sigma_{n-1}) . \tag{36} \]

But asymptotically (considering the action of $H(n)$ on $\underline{z}$ for large $n$)

\[ \frac{\|L^{-1} H(n)\,\underline{z}\|}{\|L^{-1} H(n-1)\,\underline{z}\|} \;\sim\; \Bigl(\frac{n}{n-1}\Bigr)^{k} \]

(note that the index $k$ satisfies $k \ge 1$). Therefore the time-changed Brownian interpolation
of the $F_n/n^k$ (following the proof of Theorem 6) may be compared with a scalar
Brownian motion run up to time

\[ \sum_{n} \frac{4}{n^{2k}} . \]

This sum converges, and therefore there is a positive probability of the Brownian
motion not reaching zero.
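
Both ingredients of this argument can be illustrated numerically. The sketch below again replaces $L$ by the identity (an assumption made purely for illustration, since only the asymptotic growth in $n$ matters) and takes $H(t)$ to have entries $t^{i-j}/(i-j)!$ for $i \ge j$, as suggested by (19). It compares the growth ratio with $(n/(n-1))^k$ and evaluates partial sums of $\sum_n 4/n^{2k}$; function names are illustrative.

```python
import math


def h_apply(t, z):
    """Apply H(t) to z, where H(t)[i][j] = t**(i-j) / (i-j)! for i >= j."""
    return [sum(t ** (i - j) / math.factorial(i - j) * z[j] for j in range(i + 1))
            for i in range(len(z))]


def growth_ratio(n, z):
    """||H(n) z|| / ||H(n-1) z|| (with L taken to be the identity)."""
    top = math.sqrt(sum(x * x for x in h_apply(float(n), z)))
    bottom = math.sqrt(sum(x * x for x in h_apply(float(n - 1), z)))
    return top / bottom


def intrinsic_time_partial_sum(k, n_terms):
    """Partial sum of 4 / n**(2k): the total Brownian time available to the
    bounded look-ahead scheme in the comparison above."""
    return sum(4.0 / n ** (2 * k) for n in range(1, n_terms + 1))


if __name__ == "__main__":
    k, n = 2, 1000
    print(growth_ratio(n, [1.0, 0.0, 0.0]), (n / (n - 1)) ** k)
    print(intrinsic_time_partial_sum(1, 100000), 2.0 * math.pi ** 2 / 3.0)
```

For $k = 1$ the total time is $4\zeta(2) = 2\pi^2/3$, a finite quantity, so a scalar Brownian motion started away from zero fails to reach zero within this time with positive probability. By contrast, with geometrically growing blocks the corresponding intrinsic time diverges.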
5. Conclusion

In this paper we have studied rates at which the (generalized) Kolmogorov diffusion 
can be coupled using Markovian or near-Markovian couplings. While this diffusion 
does possess successful Markovian couplings [5, 17], such couplings cannot be maximal 
(Theorem 2), nor can there be an efficient Markovian coupling strategy (Theorem 3). 
Moreover, at least in the classical case, there can be no optimal Markovian coupling 
(Theorem 5). By way of compensation, it is possible to exhibit a simple efficient
finite-look-ahead coupling strategy even for the generalized Kolmogorov diffusion (Theorem
6). Thus a controlled amount of anticipation suffices to obtain efficiency of coupling.

The results of [4] show that smooth elliptic diffusions only admit Markovian maximal 
couplings in cases of very special geometry. The Kolmogorov diffusion is the simplest 
non-trivial nilpotent diffusion, and so its failure to admit Markovian maximal couplings 
suggests the conjecture that no smooth non-trivial nilpotent diffusions can admit 
Markovian maximal couplings. Certainly this seems plausible for the case of planar 
Brownian motion plus Ito stochastic area [5, 14, 15]: perhaps the conjecture can be 
resolved in the manner of [4], using ideas from the geometry of nilpotent groups. 

The above results can be compared with those of [16], concerning the coupling of 
scalar Brownian motion together with its local time at zero. That case lies outside the 
range of nilpotent diffusions, however it does seem to provide relevant insights. For the 
local time coupling a (partly numerical) argument shows that there is no Markovian 
maximal coupling, but a control-theoretic argument shows that a simple reflection / 
synchronous coupling is optimal Markovian. It would be most interesting to determine 
whether the existence of efficient Markovian couplings for a given smooth diffusion, or 
of optimal Markovian couplings, can enforce geometric rigidity in a manner similar to 
the existence of maximal Markovian couplings. (The results in [6, Sections 3, 4] can 
be viewed as providing a very preliminary exploration to this kind of problem in the 
special case of reflecting Brownian motion in a compact convex domain.) This would 
further elucidate the enigmatic role of geometry in probabilistic coupling theory. 

A further detailed question for future research is whether it is possible to attain
efficiency for a bounded horizon finite-look-ahead coupling for a Kolmogorov diffusion.
The coupling described in Theorem 6 has a horizon which extends at a geometric rate 
over successive blocks: as noted after the proof of Theorem 6, the obvious bounded 
horizon coupling does not even have the property of being successful. There are 
instances in which a small amount of look-ahead can have a dramatic influence on 
coupling rate - see the work of Smith [23] on coupling for a Gibbs’ sampler on the 
simplex - it would be very interesting if one could map out the circumstances in 
which this might apply in the relatively well-behaved case of smooth diffusions. As 
exemplified in [23], probabilistic coupling has a large part to play in the analysis of 
random algorithms; it would be of considerable advantage to gain some case history 
for the potential of modestly non-Markovian coupling to deliver faster couplings in the 
amenable instance of smooth diffusions. 